<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 06/11/2018 at 04:11, NeilBrown
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:87wopqyk5h.fsf@notabene.neil.brown.name">
<pre wrap="">On Mon, Nov 05 2018, <a class="moz-txt-link-abbreviated" href="mailto:quentin.bouget@cea.fr">quentin.bouget@cea.fr</a> wrote:
</pre>
<blockquote type="cite">
<pre wrap="">On 04/11/2018 at 22:34, James Simmons wrote:
</pre>
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">Lockdep reports a possible deadlock between chlg_open() and
mdc_changelog_cdev_init().
mdc_changelog_cdev_init() takes chlg_registered_dev_lock and then
calls misc_register() which takes misc_mtx.
chlg_open() is called while misc_mtx is held, and tries to take
chlg_registered_dev_lock.
If these two functions race, a deadlock can occur as each thread will
hold one of the locks while trying to take the other.
chlg_open() does not need to take a lock. It only uses the
lock to stabilize a list while looking for the matching
chlg_registered_dev, and this can be found directly by examining
file->private_data.
So remove chlg_obd_get(), and use file->private_data to find the
obd_device.
Also ensure the device is fully initialized before calling
misc_register(). This means setting up some list linkage before the
call, and tearing it down if there is an error.
</pre>
</blockquote>
<pre wrap="">I have been testing this but I'm no HSM expert. I pushed this patch
to OpenSFS branch as well.
<a class="moz-txt-link-freetext" href="https://jira.whamcloud.com/browse/LU-11617">https://jira.whamcloud.com/browse/LU-11617</a>
<a class="moz-txt-link-freetext" href="https://review.whamcloud.com/#/c/33572/">https://review.whamcloud.com/#/c/33572/</a>
Reviewed-by: James Simmons <a class="moz-txt-link-rfc2396E" href="mailto:jsimmons@infradead.org">&lt;jsimmons@infradead.org&gt;</a>
</pre>
</blockquote>
<pre wrap="">
Reviewed-by: Quentin Bouget <a class="moz-txt-link-rfc2396E" href="mailto:quentin.bouget@cea.fr">&lt;quentin.bouget@cea.fr&gt;</a>
</pre>
</blockquote>
<pre wrap="">
Thanks to you both for the review.
NeilBrown
</pre>
</blockquote>
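<p>For reference, the lockless lookup described in the quoted patch can
be sketched in userspace C as follows. This is only a model, not the
actual Lustre code: <code>struct chlg_dev_sketch</code>, its fields,
<code>misc_open_sketch()</code> and <code>chlg_dev_from_file()</code>
are hypothetical names, and the <code>struct file</code> and
<code>struct miscdevice</code> definitions are minimal stand-ins for the
kernel types. It assumes, as in current mainline kernels, that the misc
core stores the miscdevice pointer in file->private_data before calling
the driver's open handler.</p>
<pre wrap="">
```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; fields are illustrative only. */
struct miscdevice { const char *name; };
struct file { void *private_data; };

/* Same idea as the kernel's container_of(): recover the enclosing struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical per-device state that embeds its miscdevice. */
struct chlg_dev_sketch {
	int obd_id;               /* stand-in for the obd_device reference */
	struct miscdevice misc;   /* what would be passed to misc_register() */
};

/* Models the misc core (assumption): it publishes the miscdevice in
 * private_data, while holding misc_mtx, before the driver's open runs. */
static void misc_open_sketch(struct miscdevice *misc, struct file *filp)
{
	filp->private_data = misc;
}

/* Models the open path: no list walk and no second lock, so there is
 * no lock ordering against the registration path to get wrong. */
static struct chlg_dev_sketch *chlg_dev_from_file(struct file *filp)
{
	struct miscdevice *misc = filp->private_data;

	assert(misc != NULL);
	return container_of(misc, struct chlg_dev_sketch, misc);
}
```
</pre>
<p>Because the open path no longer takes a second lock while misc_mtx is
held, the lock order inversion reported by lockdep cannot occur. The
sketch says nothing about the lifetime of the device once it has been
looked up.</p>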
<p>Wait! I just realised there might be another issue!<br>
I think there is now a race between chlg_open() and
mdc_changelog_cdev_finish().<br>
</p>
Wait! I just realised there might be another, bigger issue!<br>
The whole "take the first obd you can find" approach is broken! I opened a <a
moz-do-not-send="true"
href="https://jira.whamcloud.com/browse/LU-11626">ticket</a> on
Whamcloud's JIRA about it.<br>
<p>Quentin Bouget<br>
</p>
</body>
</html>