IB/qib: Convert qib_user_sdma_pin_pages() to use get_user_pages_fast()
authorJan Kara <jack@suse.cz>
Fri, 4 Oct 2013 13:29:12 +0000 (09:29 -0400)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Thu, 20 Feb 2014 18:45:33 +0000 (10:45 -0800)
commit 603e7729920e42b3c2f4dbfab9eef4878cb6e8fa upstream.

qib_user_sdma_queue_pkts() gets called with mmap_sem held for
writing. Except for get_user_pages() deep down in
qib_user_sdma_pin_pages() we don't seem to need mmap_sem at all.  Even
more interestingly the function qib_user_sdma_queue_pkts() (and also
qib_user_sdma_coalesce() called somewhat later) call copy_from_user()
which can hit a page fault and we deadlock on trying to get mmap_sem
when handling that fault.

So just make qib_user_sdma_pin_pages() use get_user_pages_fast() and
leave mmap_sem locking for mm.

This deadlock has actually been observed in the wild when the node
is under memory pressure.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Roland Dreier <roland@purestorage.com>
[Backported to 3.4: (Thank to Ben Hutchings)
 - Adjust context
 - Adjust indentation and nr_pages argument in qib_user_sdma_pin_pages()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/infiniband/hw/qib/qib_user_sdma.c

index 8244208..573b460 100644 (file)
@@ -284,8 +284,7 @@ static int qib_user_sdma_pin_pages(const struct qib_devdata *dd,
        int j;
        int ret;
 
-       ret = get_user_pages(current, current->mm, addr,
-                            npages, 0, 1, pages, NULL);
+       ret = get_user_pages_fast(addr, npages, 0, pages);
 
        if (ret != npages) {
                int i;
@@ -830,10 +829,7 @@ int qib_user_sdma_writev(struct qib_ctxtdata *rcd,
        while (dim) {
                const int mxp = 8;
 
-               down_write(&current->mm->mmap_sem);
                ret = qib_user_sdma_queue_pkts(dd, pq, &list, iov, dim, mxp);
-               up_write(&current->mm->mmap_sem);
-
                if (ret <= 0)
                        goto done_unlock;
                else {