[lustre-discuss] Full OST after reintroduction
Jon Marshall
Jon.Marshall at cruk.cam.ac.uk
Mon Jun 10 05:56:03 PDT 2024
Hi,
We had an issue a few months ago with the underlying zpool for one of our OSTs. I managed to get it mounted in read-only mode and migrated all of the files off it with lfs migrate, then recreated the OST and reintroduced it. This all went pretty smoothly. At the same time, I updated our progressive file layout using the following command:
lfs find . -type d -print0 | xargs -0 lfs setstripe -E 256M -c 1 -E eof -c -1
I then ran an lfs find to find all the files bigger than 256M and migrated them to this new layout.
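The exact command for that step is not reproduced here. A minimal sketch of what it could look like, assuming the same 256M threshold and layout options as the setstripe command above (the function name migrate_large_files is hypothetical, and -r assumes GNU xargs):

```shell
# Sketch only: find regular files larger than 256M under a directory and
# migrate them to the PFL layout used in the setstripe command above.
# migrate_large_files is an illustrative name, not a Lustre tool.
migrate_large_files() {
    dir=${1:-.}
    # -print0 / -0 keep file names with spaces or newlines intact;
    # -r (GNU xargs) skips running lfs migrate when nothing matches.
    lfs find "$dir" -type f -size +256M -print0 |
        xargs -0 -r lfs migrate -E 256M -c 1 -E eof -c -1
}

# Example usage (mount point taken from the lfs df output in this post):
#   migrate_large_files /mnt/scratchc
```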
I have since noticed that the OST that was reintroduced has been filling up more rapidly than the others, to the point where it is now full:
UUID                    bytes    Used  Available  Use%  Mounted on
scratchc-MDT0000_UUID    1.4T  108.0G       1.3T    8%  /mnt/scratchc[MDT:0]
scratchc-OST0000_UUID   55.2T   55.2T      42.0M  100%  /mnt/scratchc[OST:0]
scratchc-OST0001_UUID   55.2T   22.5T      32.7T   41%  /mnt/scratchc[OST:1]
scratchc-OST0002_UUID   46.0T   19.3T      26.7T   43%  /mnt/scratchc[OST:2]
scratchc-OST0003_UUID   46.0T   19.4T      26.6T   43%  /mnt/scratchc[OST:3]
scratchc-OST0004_UUID   46.0T   19.5T      26.5T   43%  /mnt/scratchc[OST:4]
scratchc-OST0005_UUID   55.2T   22.8T      32.5T   42%  /mnt/scratchc[OST:5]
filesystem_summary:    303.8T  158.8T     145.0T   53%  /mnt/scratchc
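To pick out the outlier from output like this without reading it by eye, a small filter could be used. This is only a sketch: full_osts is a hypothetical helper name, and it assumes the six-column layout shown above, with Use% in the fifth field.

```shell
# Hypothetical helper: print OSTs at or above a given Use% threshold.
# Reads `lfs df -h`-style output on stdin; assumes the column layout
# shown above (UUID, bytes, Used, Available, Use%, Mounted on).
full_osts() {
    awk -v limit="${1:-90}" '
        # Only consider OST lines; skip the header, MDTs and the summary.
        $1 ~ /-OST[0-9a-fA-F]+_UUID$/ {
            pct = $5
            sub(/%$/, "", pct)          # strip the trailing percent sign
            if (pct + 0 >= limit + 0) print $1, $5
        }'
}

# Example usage on a client:
#   lfs df -h /mnt/scratchc | full_osts 90
```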
For reference, to migrate the files off the OST I followed the manual and stopped new objects from being created on it with the command:
lctl set_param osp.scratchc-OST0000-osc-MDT0000.max_create_count=0
To reactivate it after rebuilding it, I copied the count from the other OSTs:
~]# lctl get_param osp.scratchc-*.max_create_count
osp.scratchc-OST0000-osc-MDT0000.max_create_count=20000
osp.scratchc-OST0001-osc-MDT0000.max_create_count=20000
osp.scratchc-OST0002-osc-MDT0000.max_create_count=20000
osp.scratchc-OST0003-osc-MDT0000.max_create_count=20000
osp.scratchc-OST0004-osc-MDT0000.max_create_count=20000
osp.scratchc-OST0005-osc-MDT0000.max_create_count=20000
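A quick sanity check on output like the above can be scripted. The following is a sketch under the assumption that lctl get_param prints one name=value pair per line, as it does here; check_create_counts is a hypothetical helper name, not a Lustre tool.

```shell
# Hypothetical helper: report whether every OSP target has the same
# max_create_count, and flag any target where creation is disabled (0).
# Reads `lctl get_param`-style "name=value" lines on stdin.
check_create_counts() {
    awk -F= '
        /max_create_count=/ {
            n++
            seen[$2]++
            if ($2 == 0) print "creation disabled: " $1
        }
        END {
            distinct = 0
            for (v in seen) distinct++
            if (n > 0 && distinct == 1) print "uniform"
            else print "differs"
        }'
}

# Example usage on the MDS:
#   lctl get_param osp.scratchc-*.max_create_count | check_create_counts
```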
As far as I can tell, I haven't told Lustre to preferentially use this one OST, so I'm a little stumped as to why this has happened. It is possible that someone has changed the default layout on some of their folders, but I'm struggling to think of a quick way of checking this.
Has anyone else run into similar problems? I'm hoping there is something incredibly obvious that I've missed somewhere!
Thanks in advance!
Jon Marshall
High Performance Computing Specialist
IT and Scientific Computing Team
Cancer Research UK Cambridge Institute
Li Ka Shing Centre | Robinson Way | Cambridge | CB2 0RE