[lustre-discuss] Odd problem with new OSTs not being used
Carlson, Timothy S
Timothy.Carlson at pnnl.gov
Thu Sep 1 16:15:21 PDT 2016
Following up on my own email.
Looks like I triggered this bug
https://jira.hpdd.intel.com/browse/LU-5778
While all of the OSTs are listed as "UP", the reality is that 4 of them had been made INACTIVE for various reasons. Once I reactivated those OSTs, the empty OSTs began to take data. Looks like I will be upgrading to 2.5.4 soon, as I really need to be able to deactivate OSTs and still have the allocator on the MDS choose new OSTs to write to.
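For anyone hitting the same thing, this is roughly how I checked and fixed it on the MDS. The device names come from the `lctl dl` listing in the quoted message below; treat the exact parameter paths as an assumption for 2.5.x, and check `lctl get_param -N osp.*` on your own system first:

```shell
# On the MDS: a deactivated OST still shows "UP" in `lctl dl`,
# but its osp "active" parameter reads 0 instead of 1.
lctl get_param osp.*.active

# Re-enable a specific OST for new allocations
# (device name taken from the lctl dl listing; adjust for your target).
lctl set_param osp.lzfs-OST0012-osc-MDT0000.active=1
```

Note that `lctl set_param` is not persistent across a remount; `lctl conf_param` is the persistent variant.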
Tim
-----Original Message-----
From: lustre-discuss [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Carlson, Timothy S
Sent: Thursday, September 1, 2016 2:00 PM
To: lustre-discuss at lists.lustre.org
Subject: [lustre-discuss] Odd problem with new OSTs not being used
Running Lustre 2.5.3(ish) backed with ZFS.
We’ve added a few OSTs; they show as being “UP” but aren’t taking any data:
[root at lzfs01a ~]# lctl dl
0 UP osd-zfs MGS-osd MGS-osd_UUID 5
1 UP mgs MGS MGS 1085
2 UP mgc MGC172.17.210.11 at o2ib9 77cf08da-86a4-7824-1878-84b540993c6d 5
3 UP osd-zfs lzfs-MDT0000-osd lzfs-MDT0000-osd_UUID 42
4 UP mds MDS MDS_uuid 3
5 UP lod lzfs-MDT0000-mdtlov lzfs-MDT0000-mdtlov_UUID 4
6 UP mdt lzfs-MDT0000 lzfs-MDT0000_UUID 1087
7 UP mdd lzfs-MDD0000 lzfs-MDD0000_UUID 4
8 UP qmt lzfs-QMT0000 lzfs-QMT0000_UUID 4
9 UP osp lzfs-OST0008-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
10 UP osp lzfs-OST0003-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
11 UP osp lzfs-OST0006-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
12 UP osp lzfs-OST0007-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
13 UP osp lzfs-OST0004-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
14 UP osp lzfs-OST000a-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
15 UP osp lzfs-OST0000-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
16 UP osp lzfs-OST0002-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
17 UP osp lzfs-OST0001-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
18 UP osp lzfs-OST0005-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
19 UP osp lzfs-OST0009-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
20 UP osp lzfs-OST000b-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
21 UP osp lzfs-OST000c-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
22 UP osp lzfs-OST000d-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
23 UP osp lzfs-OST0010-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
24 UP osp lzfs-OST000f-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
25 UP osp lzfs-OST000e-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
26 UP osp lzfs-OST0011-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
27 UP osp lzfs-OST0015-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
28 UP osp lzfs-OST0016-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
29 UP osp lzfs-OST0017-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
30 UP osp lzfs-OST0018-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
31 UP osp lzfs-OST0019-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
32 UP osp lzfs-OST001b-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
33 UP osp lzfs-OST0013-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
34 UP osp lzfs-OST0014-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
35 UP lwp lzfs-MDT0000-lwp-MDT0000 lzfs-MDT0000-lwp-MDT0000_UUID 5
36 UP osp lzfs-OST001c-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
37 UP osp lzfs-OST0012-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
38 UP osp lzfs-OST001a-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
39 UP osp lzfs-OST001d-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
40 UP osp lzfs-OST001e-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
41 UP osp lzfs-OST001f-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
42 UP osp lzfs-OST0020-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
43 UP osp lzfs-OST0021-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
44 UP osp lzfs-OST0022-osc-MDT0000 lzfs-MDT0000-mdtlov_UUID 5
Now if you look at devices 36 and higher, you’ll see that they don’t have much data, even though they have been online for a few weeks and this is a fairly active file system. What data they do contain is data that I have “forced” onto those OSTs for testing by setting the stripe to that specific OST.
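The “forcing” was done with `lfs setstripe`, roughly like the sketch below (the directory path and OST index are illustrative; index 18 corresponds to lzfs-OST0012 in the `lfs df` output that follows):

```shell
# Pin new files under a test directory to OST index 18, bypassing
# the MDS allocator: -c 1 = one stripe, -i 18 = starting OST index.
lfs setstripe -c 1 -i 18 /pic/osttest

# Any file created under that directory now lands on OST 18.
dd if=/dev/zero of=/pic/osttest/fill.dat bs=1M count=1024

# Confirm which OST actually holds the object.
lfs getstripe /pic/osttest/fill.dat
```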
# lfs df
UUID 1K-blocks Used Available Use% Mounted on
lzfs-MDT0000_UUID 60762585216 262897152 60499686016 0% /pic[MDT:0]
lzfs-OST0000_UUID 90996712832 82190600320 8805795072 90% /pic[OST:0]
lzfs-OST0001_UUID 90996823936 82773737088 8221323776 91% /pic[OST:1]
lzfs-OST0002_UUID 90996723840 82547420928 8448555520 91% /pic[OST:2]
lzfs-OST0003_UUID 90996780416 82570822400 8425071872 91% /pic[OST:3]
lzfs-OST0004_UUID 90996792320 83526260096 7466092288 92% /pic[OST:4]
lzfs-OST0005_UUID 90996764544 83071284864 7922972800 91% /pic[OST:5]
lzfs-OST0006_UUID 90996729600 83348930304 7643451520 92% /pic[OST:6]
lzfs-OST0007_UUID 90996800000 82677238272 8314902016 91% /pic[OST:7]
lzfs-OST0008_UUID 90996910208 83598099584 7396038656 92% /pic[OST:8]
lzfs-OST0009_UUID 90997091328 85659415424 5335623552 94% /pic[OST:9]
lzfs-OST000a_UUID 90996807680 83581871872 7410268800 92% /pic[OST:10]
lzfs-OST000b_UUID 90996676352 77512128000 13484523136 85% /pic[OST:11]
lzfs-OST000c_UUID 90996505984 86176576256 4819325824 95% /pic[OST:12]
lzfs-OST000d_UUID 90997104256 90339916032 656510208 99% /pic[OST:13]
lzfs-OST000e_UUID 90996660480 86856594560 4134641792 95% /pic[OST:14]
lzfs-OST000f_UUID 90996441472 82859149568 8134773888 91% /pic[OST:15]
lzfs-OST0010_UUID 90996592896 88961102592 2034770816 98% /pic[OST:16]
lzfs-OST0011_UUID 90996459264 83005755520 7989576448 91% /pic[OST:17]
lzfs-OST0012_UUID 90999916800 1073280 90998828928 0% /pic[OST:18]
lzfs-OST0013_UUID 90996418560 83272862336 7716835328 92% /pic[OST:19]
lzfs-OST0014_UUID 90996442496 84503368320 6486773504 93% /pic[OST:20]
lzfs-OST0015_UUID 90996476416 82157845376 8831992320 90% /pic[OST:21]
lzfs-OST0016_UUID 90996456960 83149106688 7844745088 91% /pic[OST:22]
lzfs-OST0017_UUID 90996518912 84959371648 6032033408 93% /pic[OST:23]
lzfs-OST0018_UUID 90996448000 84187752448 6806393088 93% /pic[OST:24]
lzfs-OST0019_UUID 90996425344 85975606784 5012946816 94% /pic[OST:25]
lzfs-OST001a_UUID 90999916032 1920 90999912064 0% /pic[OST:26]
lzfs-OST001b_UUID 90996397312 82557086592 8435064576 91% /pic[OST:27]
lzfs-OST001c_UUID 90999915008 44549120 90955355520 0% /pic[OST:28]
lzfs-OST001d_UUID 90999917184 1920 90999913088 0% /pic[OST:29]
lzfs-OST001e_UUID 90999917312 2048 90999913216 0% /pic[OST:30]
lzfs-OST001f_UUID 90999917056 1920 90999903104 0% /pic[OST:31]
lzfs-OST0020_UUID 90999917312 2048 90999913216 0% /pic[OST:32]
lzfs-OST0021_UUID 90999917056 1920 90999903104 0% /pic[OST:33]
lzfs-OST0022_UUID 90999917312 2048 90999913216 0% /pic[OST:34]
filesystem summary: 3184912212480 2182065540096 1002764557568 69% /pic
Any ideas of something I need to trigger on the MDS to get files to begin landing on the empty OSTs? I was thinking about rebooting the MDS, but wanted to use a smaller hammer to get this working.
Thanks!
Tim
_______________________________________________
lustre-discuss mailing list
lustre-discuss at lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org