[lustre-devel] [PATCH v2] lustre: mdc: fix possible deadlock in chlg_open()

quentin.bouget at cea.fr quentin.bouget at cea.fr
Tue Nov 6 02:41:56 PST 2018


Le 06/11/2018 à 04:11, NeilBrown a écrit :
> On Mon, Nov 05 2018, quentin.bouget at cea.fr wrote:
>
>> Le 04/11/2018 à 22:34, James Simmons a écrit :
>>>> Lockdep reports a possible deadlock between chlg_open() and
>>>> mdc_changelog_cdev_init()
>>>>
>>>> mdc_changelog_cdev_init() takes chlg_registered_dev_lock and then
>>>> calls misc_register() which takes misc_mtx.
>>>> chlg_open() is called while misc_mtx is held, and tries to take
>>>> chlg_registered_dev_lock.
>>>> If these two functions race, a deadlock can occur as each thread will
>>>> hold one of the locks while trying to take the other.
>>>>
>>>> chlg_open() does not need to take a lock.  It only uses the
>>>> lock to stablize a list while looking for the matching
>>>> chlg_registered_dev, and this can be found directly by examining
>>>> file->private_data.
>>>>
>>>> So remove chlg_obd_get(), and use file->private_data to find the
>>>> obd_device.
>>>> Also ensure the device is fully initialized before calling
>>>> misc_register().  This means setting up some list linkage before the
>>>> call, and tearing it down if there is an error.
>>> I have been testing this but I'm no HSM expert. I pushed this patch
>>> to OpenSFS branch as well.
>>>
>>> https://jira.whamcloud.com/browse/LU-11617
>>> https://review.whamcloud.com/#/c/33572/
>>>
>>> Reviewed-by: James Simmons <jsimmons at infradead.org>
>> Reviewed-by: Quentin Bouget <quentin.bouget at cea.fr>
>>
> Thanks to you both for the review.
>
> NeilBrown
>
Wait! I just realised there might be another issue!
I think there is now a race between chlg_open() and 
mdc_changelog_cdev_finish().

Wait! I just realised there might be another bigger issue!
The whole "take the first obd you can find" is broken! I opened a ticket 
<https://jira.whamcloud.com/browse/LU-11626> on whamcloud's JIRA about it.

Quentin Bouget

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.lustre.org/pipermail/lustre-devel-lustre.org/attachments/20181106/bad9cdb2/attachment-0001.html>


More information about the lustre-devel mailing list