[lustre-discuss] Suspended jobs and rebooting lustre servers

Raj Ayyampalayam ansraj at gmail.com
Thu Feb 21 21:04:40 PST 2019


Got it. I'd rather be safe than sorry. This is my first time making a Lustre
configuration change.

Raj

On Thu, Feb 21, 2019, 11:55 PM Raj <rajgautam at gmail.com> wrote:

> I also agree with Colin's comment.
> If the current OSTs are not touched, and you are only adding new OSTs to
> existing OSS nodes and adding new ost-mount resources to your existing
> (already running) Pacemaker configuration, you can achieve the upgrade with
> no downtime. If your Corosync-Pacemaker configuration is working correctly,
> you can fail over and fail back, taking turns rebooting each OSS node. But
> the chance of human error in doing this is high.
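> As a rough sketch of what I mean (the resource names, node names, device
> paths, and scores below are placeholders, not your actual configuration),
> adding one new OST mount to an already-running Pacemaker cluster and later
> draining an OSS for a reboot could look like:
>
>   # add the new OST mount as a Filesystem resource
>   pcs resource create fs-OST0004 ocf:heartbeat:Filesystem \
>       device=/dev/mapper/ost0004 directory=/mnt/ost0004 fstype=lustre \
>       op monitor interval=60s timeout=120s
>   # prefer its primary OSS, but allow it on the failover partner
>   pcs constraint location fs-OST0004 prefers oss01=100
>   pcs constraint location fs-OST0004 prefers oss02=50
>
>   # later, to reboot an OSS: drain it, reboot, bring it back
>   pcs node standby oss01
>   # ... reboot oss01 ...
>   pcs node unstandby oss01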
>
> On Thu, Feb 21, 2019 at 10:30 PM Raj Ayyampalayam <ansraj at gmail.com>
> wrote:
>
>> Hi Raj,
>>
>> Thanks for the explanation. We will have to rethink our upgrade process.
>>
>> Thanks again.
>> Raj
>>
>> On Thu, Feb 21, 2019, 10:23 PM Raj <rajgautam at gmail.com> wrote:
>>
>>> Hello Raj,
>>> It's best and safest to unmount the filesystem from all the clients and
>>> then do the upgrade. Since your FS is getting more OSTs and the
>>> configuration of the existing ones is changing, the clients need to pick up
>>> the new layout by remounting it.
>>> You also mentioned client eviction: during an eviction the client has to
>>> drop its dirty pages, and all the open file descriptors in the FS will be
>>> gone.
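>>> For reference, the client side of that remount is just the following (the
>>> MGS NID, fsname, and mount point are placeholders for your own values):
>>>
>>>   umount /lustre
>>>   mount -t lustre mgs01@tcp:/lfs01 /lustre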
>>>
>>> On Thu, Feb 21, 2019 at 12:25 PM Raj Ayyampalayam <ansraj at gmail.com>
>>> wrote:
>>>
>>>> What can I expect to happen to the jobs that are suspended during the
>>>> file system restart?
>>>> Will the processes holding an open file handle die when I unsuspend
>>>> them after the filesystem restart?
>>>>
>>>> Thanks!
>>>> -Raj
>>>>
>>>>
>>>> On Thu, Feb 21, 2019 at 12:52 PM Colin Faber <cfaber at gmail.com> wrote:
>>>>
>>>>> Ah yes,
>>>>>
>>>>> If you're adding to an existing OSS, then you will need to reconfigure
>>>>> the file system, which requires a writeconf event.
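>>>>> Roughly, the writeconf procedure (see the Lustre manual; the device paths
>>>>> below are placeholders) is: unmount all clients, unmount all targets,
>>>>> regenerate the configuration logs, then remount everything in order:
>>>>>
>>>>>   # with every target unmounted, on the MDS first and then on each OSS:
>>>>>   tunefs.lustre --writeconf /dev/mapper/mdt0
>>>>>   tunefs.lustre --writeconf /dev/mapper/ost0000   # repeat for each OST
>>>>>
>>>>>   # then remount: MGT, MDT(s), all OSTs, and finally the clients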
>>>>>
>>>>
>>>>> On Thu, Feb 21, 2019 at 10:00 AM Raj Ayyampalayam <ansraj at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> The new OSTs will be added to the existing file system (the OSS
>>>>>> nodes are already part of the filesystem), so I will have to re-configure
>>>>>> the current HA resource configuration to tell it about the 4 new OSTs.
>>>>>> Our ExaScaler's HA monitors the individual OSTs, and I need to
>>>>>> re-configure the HA on the existing filesystem.
>>>>>>
>>>>>> Our vendor support has confirmed that we would have to restart the
>>>>>> filesystem if we want to regenerate the HA configs to include the new
>>>>>> OSTs.
>>>>>>
>>>>>> Thanks,
>>>>>> -Raj
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 21, 2019 at 11:23 AM Colin Faber <cfaber at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> It seems to me that steps may still be missing?
>>>>>>>
>>>>>>> You're going to rack/stack and provision the OSS nodes with new
>>>>>>> OSTs.
>>>>>>>
>>>>>>> Then you're going to introduce failover options somewhere? On the new
>>>>>>> OSTs? On the existing system?
>>>>>>>
>>>>>>> If you're introducing failover with the new OSTs and leaving the
>>>>>>> existing system in place, you should be able to accomplish this without
>>>>>>> bringing the system offline.
>>>>>>>
>>>>>>> If you're going to be introducing failover to your existing system,
>>>>>>> then you will need to reconfigure the file system to accommodate the new
>>>>>>> failover settings (failover nodes, etc.).
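>>>>>>> For example, declaring failover NIDs on existing targets is typically
>>>>>>> done with tunefs.lustre (the NIDs and device path below are
>>>>>>> placeholders), and that kind of change is what drives the
>>>>>>> writeconf/restart:
>>>>>>>
>>>>>>>   # on the OSS, with the target unmounted
>>>>>>>   tunefs.lustre --servicenode=oss01@tcp --servicenode=oss02@tcp \
>>>>>>>       /dev/mapper/ost0000
>>>>>>>   # followed by a writeconf of the filesystem so clients pick up the
>>>>>>>   # new NIDs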
>>>>>>>
>>>>>>> -cf
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Feb 21, 2019 at 9:13 AM Raj Ayyampalayam <ansraj at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Our upgrade strategy is as follows:
>>>>>>>>
>>>>>>>> 1) Load all disks into the storage array.
>>>>>>>> 2) Create RAID pools and virtual disks.
>>>>>>>> 3) Create the new Lustre OSTs with the mkfs.lustre command, roughly as
>>>>>>>> in the sketch after this list. (I still have to figure out all the
>>>>>>>> parameters used on the existing OSTs.)
>>>>>>>> 4) Create mount points on all OSSs.
>>>>>>>> 5) Mount the lustre OSTs.
>>>>>>>> 6) Maybe rebalance the filesystem.
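>>>>>>>> A sketch of what I have in mind for steps 3-5 (the fsname, index, NIDs,
>>>>>>>> and device paths are placeholders, not our real values):
>>>>>>>>
>>>>>>>>   # format one new OST (repeat with the next free --index for each disk)
>>>>>>>>   mkfs.lustre --ost --fsname=lfs01 --index=24 --mgsnode=mgs01@tcp \
>>>>>>>>       --servicenode=oss05@tcp --servicenode=oss06@tcp /dev/mapper/ost0024
>>>>>>>>
>>>>>>>>   # mount it on its primary OSS to register it with the MGS
>>>>>>>>   mkdir -p /mnt/ost0024
>>>>>>>>   mount -t lustre /dev/mapper/ost0024 /mnt/ost0024
>>>>>>>>
>>>>>>>> For step 6 I assume we would use lfs_migrate on selected files later.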
>>>>>>>> My understanding is that the above can be done without bringing the
>>>>>>>> filesystem down. I also want to create the HA configuration (Corosync
>>>>>>>> and Pacemaker) for the new OSTs, and that step requires the filesystem
>>>>>>>> to be down. I want to know what would happen to the suspended processes
>>>>>>>> across the cluster when I bring the filesystem down to re-generate the
>>>>>>>> HA configs.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -Raj
>>>>>>>>
>>>>>>>> On Thu, Feb 21, 2019 at 12:59 AM Colin Faber <cfaber at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Can you provide more details on your upgrade strategy? In some
>>>>>>>>> cases expanding your storage shouldn't impact client / job activity at all.
>>>>>>>>>
>>>>>>>>> On Wed, Feb 20, 2019, 11:09 AM Raj Ayyampalayam <ansraj at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> We are planning to expand our storage by adding more OSTs to
>>>>>>>>>> our Lustre file system. It looks like it would be easier to expand if we
>>>>>>>>>> bring the filesystem down and perform the necessary operations. We are
>>>>>>>>>> planning to suspend all the jobs running on the cluster. We originally
>>>>>>>>>> planned to add the new OSTs to the live filesystem.
>>>>>>>>>>
>>>>>>>>>> We are trying to determine the potential impact on the suspended
>>>>>>>>>> jobs if we bring down the filesystem for the upgrade.
>>>>>>>>>> One of the questions we have is: what would happen to the
>>>>>>>>>> suspended processes that hold an open file handle in the Lustre file
>>>>>>>>>> system when the filesystem is brought down for the upgrade?
>>>>>>>>>> Will they recover from the client eviction?
>>>>>>>>>>
>>>>>>>>>> We do have vendor support and have engaged them. I wanted to ask
>>>>>>>>>> the community and get some feedback.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> -Raj
>>>>>>>>>>

