[lustre-discuss] Replacing ldiskfs MDT with larger disk
Jesse Stroik
jesse.stroik at ssec.wisc.edu
Mon Aug 5 09:49:31 PDT 2019
Ah, never mind. It appears that this can be done if 'lfs migrate -m' is
used directly instead of the lfs_migrate script.
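
For anyone finding this later, here is a minimal sketch of what that might look like when applied to each top-level directory. The function name, the DRY_RUN switch, and the /mnt/lustre mount point are assumptions for illustration, not from this thread; only 'lfs migrate -m' itself is. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: print (or run) one 'lfs migrate -m' per immediate subdirectory.
# DRY_RUN defaults to 1, so commands are only printed; set DRY_RUN=0 on a
# Lustre client to execute them. Names and paths here are hypothetical.

plan_mdt_migration() {
    root=$1        # mount point or parent directory to walk
    mdt_index=$2   # target MDT index, e.g. 1 for an MDT made with --index=1
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue   # skips the unexpanded glob if no subdirs
        dir=${dir%/}                # drop the trailing slash from the glob
        if [ "${DRY_RUN:-1}" = "1" ]; then
            printf 'lfs migrate -m %s %s\n' "$mdt_index" "$dir"
        else
            lfs migrate -m "$mdt_index" "$dir"
        fi
    done
}

# Hypothetical mount point; prints nothing if it does not exist.
plan_mdt_migration "${LUSTRE_ROOT:-/mnt/lustre}" 1
```

Reviewing the printed commands first and running them one directory at a time seems the safer way to start.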
Best,
Jesse
On 8/5/19 11:26 AM, Jesse Stroik wrote:
>
> On 7/31/19 6:27 PM, Andreas Dilger wrote:
>> Just to clarify, when I referred to "file level backup/restore", I was
>> referring to the MDT ldiskfs filesystem, not the whole Lustre
>> filesystem (which would be _much_ too large for most sites). The
>> various backup/restore methods are documented in the Lustre Operations
>> Manual.
>
>
> Yes - I sometimes copy file systems and I typically do so as cluster
> jobs so I can adjust the rate of the copy. But that requires having
> spare petabytes available ;)
>
> I created an MDT with --index=1 for DNE and ran into an issue. I had
> assumed lfs setdirstripe works like lfs setstripe on existing
> directories, so that newly created files would be assigned to the new MDT.
>
> However, setdirstripe is an alias for lfs mkdir, so I cannot change the MDT
> setting on existing directories. I planned to change the MDT setting on
> the directories and use lfs_migrate in the background to effect the
> migration so it would be transparent to the end users.
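
To illustrate the distinction described above, a small sketch, assuming a hypothetical path and a made-up run() helper with a DRY_RUN switch: 'lfs mkdir -i' places a new directory on a chosen MDT, and 'lfs getdirstripe' reports where a directory lives. Nothing here retargets an existing directory.

```shell
#!/bin/sh
# Sketch: 'lfs mkdir -i' (the same command as 'lfs setdirstripe -i') creates
# a NEW directory on a given MDT. Commands are only printed unless DRY_RUN=0.
# The helper names and the path are hypothetical.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

make_dir_on_mdt() {
    run lfs mkdir -i "$1" "$2"    # create directory $2 on MDT index $1
    run lfs getdirstripe "$2"     # show the directory's MDT placement
}

make_dir_on_mdt 1 /mnt/lustre/new_project
```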
>
> Is there a better way to migrate usage to the new MDT than recreating all
> of the directories?
>
> Jesse
>
>
>
>
>> Cheers, Andreas
>>
>>> On Jul 31, 2019, at 15:10, Jesse Stroik <jesse.stroik at ssec.wisc.edu>
>>> wrote:
>>>
>>> This is excellent information, Andreas.
>>>
>>> Presently we do file level backups to the live file system and they
>>> take over 24 hours, so they're done continuously. For that timeframe
>>> to work, we'd need to be able to back up and recover the MDT to the
>>> new MDT with the file system online.
>>>
>>> Given that resizing the file system will proportionately increase the
>>> inodes (I didn't realize that), dd to a logical volume may be a
>>> reasonable option for us. The dd would be fast enough that we could
>>> weather the downtime.
>>>
>>> PFL and FLR aren't features they're planning for the file system and
>>> it may be replaced next year so I suspect they'll opt for the DNE
>>> method.
>>>
>>> Thanks again,
>>> Jesse Stroik
>>>
>>> On 7/31/19 3:11 PM, Andreas Dilger wrote:
>>>> Normally the easy answer would be a "dd" copy of the MDT device from
>>>> your HDDs to a larger SSD LUN, followed by resize2fs to grow the
>>>> filesystem, which would also increase the number of inodes
>>>> proportionately to the LUN size.
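
The dd-then-resize2fs sequence described above could be sketched roughly as follows, assuming the MDT is unmounted for the duration. The device paths and the block size are hypothetical, and the e2fsck step is an assumption added here (resize2fs normally asks for a forced check before an offline resize). The function only prints the commands, since running them against the wrong device would be destructive.

```shell
#!/bin/sh
# Sketch: print the command sequence for copying an ldiskfs MDT to a larger
# device and growing the filesystem. Print-only by design.
# Device paths, bs=4M, and the e2fsck step are assumptions for illustration.

plan_dd_resize() {
    src=$1   # old MDT block device (unmounted)
    dst=$2   # new, larger block device
    printf 'dd if=%s of=%s bs=4M\n' "$src" "$dst"   # raw device copy
    printf 'e2fsck -f %s\n' "$dst"                  # forced check before resize
    printf 'resize2fs %s\n' "$dst"                  # grow to fill the new device
}

plan_dd_resize /dev/old_mdt /dev/new_mdt
```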
>>>> However, since you are *not* using 1024-byte inode size, only
>>>> 512-byte inode size + 512 bytes of space for other things (i.e. a 1024
>>>> bytes-per-inode ratio), I'd suggest a file-level MDT backup/restore
>>>> to a newly-formatted MDT because newer features like PFL and FLR
>>>> need more space in the inode itself. The benefit of this approach is
>>>> that you keep a full backup of the MDT on the HDDs in case of
>>>> problems. Note that after backup/restore the LFSCK OI Scrub will
>>>> run for some time (maybe an hour or two, depending on size), which
>>>> will result in slowdown. That would likely be compensated by faster
>>>> SSD storage.
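
As a rough illustration of the bytes-per-inode arithmetic above: the inode count is approximately the device size divided by the bytes-per-inode ratio, and the inode tables take roughly inode count times inode size. The 1 TiB device size and the function names are assumptions for the example; real mke2fs counts differ slightly because of block-group rounding.

```shell
#!/bin/sh
# Sketch: estimate inode count and inode-table space for an ldiskfs MDT.
# The 1 TiB size is a hypothetical example, not a figure from this thread.

inode_count() {          # $1 = device size in bytes, $2 = bytes-per-inode ratio
    echo $(( $1 / $2 ))
}

inode_table_bytes() {    # $1 = inode count, $2 = inode size in bytes
    echo $(( $1 * $2 ))
}

size=1099511627776                        # 1 TiB
inodes=$(inode_count "$size" 1024)        # ratio from "-i 1024"
echo "inodes: $inodes"                    # inodes: 1073741824
echo "inode table bytes: $(inode_table_bytes "$inodes" 512)"   # 549755813888
```

With a 1024 ratio and 512-byte inodes, half of the device goes to inode tables, which matches the "512-byte inode size + 512 bytes for other things" description.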
>>>> If you go the DNE route, then migrate some of the namespace to the
>>>> new MDT, you definitely still need to keep MDT0000. However, you
>>>> could combine these approaches and still copy MDT0000 to new flash
>>>> storage instead of keeping the HDDs around forever. I'd again
>>>> recommend a file-level MDT backup/restore to a newly-formatted MDT
>>>> to get the newer format options.
>>>> Cheers, Andreas
>>>>> On Jul 31, 2019, at 13:50, Jesse Stroik
>>>>> <jesse.stroik at ssec.wisc.edu> wrote:
>>>>>
>>>>> Hi everyone,
>>>>>
>>>>> One of our lustre file systems outgrew its MDT and the original
>>>>> scope of its operation. This one is still running ldiskfs on the
>>>>> MDT. Here's our setup and restrictions:
>>>>>
>>>>> - centos 6 / lustre 2.8
>>>>> - ldiskfs MDT
>>>>> - minimal downtime allowed, but the FS can be read-only for a while.
>>>>>
>>>>> The MDT itself, set up with -i 1024, needs both more space and
>>>>> available inodes. Its purpose changed in scope and we'd now like
>>>>> the performance benefits of getting off of spinning media as well.
>>>>>
>>>>> We need a new file system instead of expanding the existing
>>>>> ldiskfs because we need more inodes.
>>>>>
>>>>> I think my options are (1) a file level backup and recovery or
>>>>> direct copy onto the new file system or (2) add a new MDT to the
>>>>> system and assign all directories under the root to it, then
>>>>> lfs_migrate everything on the file system thereafter.
>>>>>
>>>>> Is there a disadvantage to the DNE approach other than the fact
>>>>> that we have to keep the original spinning-disk MDT around to
>>>>> service the root of the FS?
>>>>>
>>>>> If we had to do option 1, we'd want to remount the current MDT read
>>>>> only and continue using it while we were preparing the new MDT. When I
>>>>> searched, I couldn't find anything that seemed definitive about
>>>>> ensuring no changes to an ldiskfs MDT during operation and I don't
>>>>> want to assume I can simply remount it read only.
>>>>>
>>>>> Thanks,
>>>>> Jesse Stroik
>>>>>
>>>>> _______________________________________________
>>>>> lustre-discuss mailing list
>>>>> lustre-discuss at lists.lustre.org
>>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>>
>>
>> Cheers, Andreas
>> --
>> Andreas Dilger
>> Principal Lustre Architect
>> Whamcloud
>>
>
>
>