hugetlb: fix memory leak associated with vma_lock structure
author Mike Kravetz <mike.kravetz@oracle.com>
Wed, 19 Oct 2022 20:19:57 +0000 (13:19 -0700)
committer Andrew Morton <akpm@linux-foundation.org>
Fri, 21 Oct 2022 04:27:23 +0000 (21:27 -0700)
The hugetlb vma_lock structure hangs off the vm_private_data pointer of
sharable hugetlb vmas.  The structure is vma specific and can not be
shared between vmas.  At fork and various other times, vmas are duplicated
via vm_area_dup().  When this happens, the pointer in the newly created
vma must be cleared and the structure reallocated.  Two hugetlb specific
routines deal with this: hugetlb_dup_vma_private and hugetlb_vm_op_open.
Both routines are called for newly created vmas.  hugetlb_dup_vma_private
would always clear the pointer and hugetlb_vm_op_open would allocate the
new vma_lock structure.  This did not work for the calling sequence
pointed out in [1].

  move_vma
    copy_vma
      new_vma = vm_area_dup(vma);
      new_vma->vm_ops->open(new_vma); --> new_vma has its own vma lock.
    is_vm_hugetlb_page(vma)
      clear_vma_resv_huge_pages
        hugetlb_dup_vma_private --> vma->vm_private_data is set to NULL

When hugetlb_dup_vma_private clears vm_private_data in this sequence, the
vma_lock structure it pointed to is no longer reachable and is leaked.
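
The leak can be reproduced in a small userspace model.  This is only a
sketch: the model_* names and structs are hypothetical stand-ins, not the
kernel definitions, and malloc() stands in for the kernel allocation.  It
shows a vma that already owns its lock losing the only pointer to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct model_vma;

struct model_lock {
	struct model_vma *vma;		/* back pointer to the owning vma */
};

struct model_vma {
	struct model_lock *vm_private_data;
};

/* Models the pre-patch hugetlb_vm_op_open: always clear, then allocate. */
static void model_old_open(struct model_vma *vma)
{
	struct model_lock *lock = malloc(sizeof(*lock));

	assert(lock != NULL);
	lock->vma = vma;
	vma->vm_private_data = lock;
}

/* Models the pre-patch hugetlb_dup_vma_private: unconditional clear. */
static void model_old_dup_private(struct model_vma *vma)
{
	vma->vm_private_data = NULL;
}

/*
 * Returns 1 if the lock allocated by open became unreachable from the
 * vma after dup_private ran, which is the leak described above.
 */
static int model_leaks_lock(void)
{
	struct model_vma vma = { NULL };
	struct model_lock *allocated;
	int leaked;

	model_old_open(&vma);
	/* Side handle kept only so the model itself can free the memory. */
	allocated = vma.vm_private_data;
	model_old_dup_private(&vma);

	leaked = (allocated != NULL && vma.vm_private_data == NULL);
	free(allocated);
	return leaked;
}
```

In the kernel there is no such side handle, so once the pointer is cleared
nothing can free the structure.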

The vma_lock structure contains a pointer to the associated vma.  This
information can be used in hugetlb_dup_vma_private and hugetlb_vm_op_open
to ensure we only clear the vm_private_data of newly created (copied)
vmas.  In such cases, the vma->vma_lock->vma field will not point to the
vma.

Update hugetlb_dup_vma_private and hugetlb_vm_op_open to not clear
vm_private_data if vma->vma_lock->vma == vma.  Also, log a warning if
hugetlb_vm_op_open ever encounters the case where vma_lock has already
been correctly allocated for the vma.
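
The ownership check can be sketched in userspace as follows.  The sketch_*
names, the may_share flag (standing in for VM_MAYSHARE) and the return
value of sketch_open are assumptions made for illustration; the actual
change is in the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct sketch_vma;

struct sketch_lock {
	struct sketch_vma *vma;		/* back pointer to the owning vma */
};

struct sketch_vma {
	int may_share;			/* stands in for VM_MAYSHARE */
	struct sketch_lock *vm_private_data;
};

/* Stand-in for hugetlb_vma_lock_alloc: allocate a lock owned by vma. */
static void sketch_lock_alloc(struct sketch_vma *vma)
{
	struct sketch_lock *lock = malloc(sizeof(*lock));

	assert(lock != NULL);
	lock->vma = vma;
	vma->vm_private_data = lock;
}

/* Clear the pointer only when it was copied from another vma. */
static void sketch_dup_private(struct sketch_vma *vma)
{
	if (vma->may_share) {
		struct sketch_lock *lock = vma->vm_private_data;

		if (lock && lock->vma != vma)
			vma->vm_private_data = NULL;
	} else {
		vma->vm_private_data = NULL;
	}
}

/* Returns 1 if a new lock was allocated, 0 otherwise. */
static int sketch_open(struct sketch_vma *vma)
{
	struct sketch_lock *lock = vma->vm_private_data;

	if (!vma->may_share)
		return 0;
	if (lock && lock->vma == vma)
		return 0;	/* already owned; the kernel logs a warning */
	vma->vm_private_data = NULL;
	sketch_lock_alloc(vma);
	return 1;
}

/* A lock owned by the vma must survive sketch_dup_private. */
static int sketch_owned_lock_survives(void)
{
	struct sketch_vma vma = { 1, NULL };
	int kept;

	sketch_open(&vma);
	sketch_dup_private(&vma);
	kept = vma.vm_private_data != NULL &&
	       vma.vm_private_data->vma == &vma;
	free(vma.vm_private_data);
	return kept;
}

/* A pointer copied from another vma must still be cleared. */
static int sketch_copied_pointer_cleared(void)
{
	struct sketch_vma parent = { 1, NULL };
	struct sketch_vma child;
	int cleared;

	sketch_open(&parent);
	child = parent;		/* models vm_area_dup copying the pointer */
	sketch_dup_private(&child);
	cleared = (child.vm_private_data == NULL);
	free(parent.vm_private_data);
	return cleared;
}
```

The back pointer is what distinguishes the two cases: a freshly copied vma
carries a lock whose vma field still names the original, while a vma that
has been through open carries a lock that names itself.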

[1] https://lore.kernel.org/linux-mm/5154292a-4c55-28cd-0935-82441e512fc3@huawei.com/

Link: https://lkml.kernel.org/r/20221019201957.34607-1-mike.kravetz@oracle.com
Fixes: 131a79b474e9 ("hugetlb: fix vma lock handling during split vma and range unmapping")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: James Houghton <jthoughton@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Prakash Sangappa <prakash.sangappa@oracle.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/hugetlb.c

index dede033..546df97 100644
@@ -1014,15 +1014,23 @@ void hugetlb_dup_vma_private(struct vm_area_struct *vma)
        VM_BUG_ON_VMA(!is_vm_hugetlb_page(vma), vma);
        /*
         * Clear vm_private_data
+        * - For shared mappings this is a per-vma semaphore that may be
+        *   allocated in a subsequent call to hugetlb_vm_op_open.
+        *   Before clearing, make sure pointer is not associated with vma
+        *   as this will leak the structure.  This is the case when called
+        *   via clear_vma_resv_huge_pages() and hugetlb_vm_op_open has already
+        *   been called to allocate a new structure.
         * - For MAP_PRIVATE mappings, this is the reserve map which does
         *   not apply to children.  Faults generated by the children are
         *   not guaranteed to succeed, even if read-only.
-        * - For shared mappings this is a per-vma semaphore that may be
-        *   allocated in a subsequent call to hugetlb_vm_op_open.
         */
-       vma->vm_private_data = (void *)0;
-       if (!(vma->vm_flags & VM_MAYSHARE))
-               return;
+       if (vma->vm_flags & VM_MAYSHARE) {
+               struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
+
+               if (vma_lock && vma_lock->vma != vma)
+                       vma->vm_private_data = NULL;
+       } else
+               vma->vm_private_data = NULL;
 }
 
 /*
@@ -4601,6 +4609,7 @@ static void hugetlb_vm_op_open(struct vm_area_struct *vma)
        struct resv_map *resv = vma_resv_map(vma);
 
        /*
+        * HPAGE_RESV_OWNER indicates a private mapping.
         * This new VMA should share its siblings reservation map if present.
         * The VMA will only ever have a valid reservation map pointer where
         * it is being copied for another still existing VMA.  As that VMA
@@ -4615,11 +4624,21 @@ static void hugetlb_vm_op_open(struct vm_area_struct *vma)
 
        /*
         * vma_lock structure for sharable mappings is vma specific.
-        * Clear old pointer (if copied via vm_area_dup) and create new.
+        * Clear old pointer (if copied via vm_area_dup) and allocate
+        * new structure.  Before clearing, make sure vma_lock is not
+        * for this vma.
         */
        if (vma->vm_flags & VM_MAYSHARE) {
-               vma->vm_private_data = NULL;
-               hugetlb_vma_lock_alloc(vma);
+               struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
+
+               if (vma_lock) {
+                       if (vma_lock->vma != vma) {
+                               vma->vm_private_data = NULL;
+                               hugetlb_vma_lock_alloc(vma);
+                       } else
+                               pr_warn("HugeTLB: vma_lock already exists in %s.\n", __func__);
+               } else
+                       hugetlb_vma_lock_alloc(vma);
        }
 }