When user deletes a tcmu device via configFS, tcmu calls
uio_unregister_device(). During that call uio resets its pointer to struct
uio_info provided by tcmu. That means, after uio_unregister_device() uio
will no longer execute any of the callbacks tcmu had set in uio_info.
Especially, if userspace daemon still holds the corresponding uio device
open or mmap'ed while tcmu calls uio_unregister_device(), uio will not call
tcmu_release() when userspace finally closes and munmaps the uio device.
Since tcmu does refcounting for the tcmu device in tcmu_open() and
tcmu_release(), in the decribed case refcount does not drop to 0 and tcmu
does not free tcmu device's resources. In extreme cases this can cause
memory leaking of up to 1 GB for a single tcmu device.
After uio_unregister_device(), uio will reject every open, read, write,
mmap from userspace with -EOI. But userspace daemon can still access the
mmap'ed command ring and data area. Therefore tcmu should wait until
userspace munmaps the uio device before it frees the resources, as we don't
want to cause SIGSEGV or SIGBUS to user space.
That said, current refcounting during tcmu_open and tcmu_release does not
work correctly, and refcounting better should be done in the open and close
callouts of the vm_operations_struct, which tcmu assigns to each mmap of
the uio device (because it wants its own page fault handler).
This patch fixes the memory leak by removing refcounting from tcmu_open and
tcmu_close, and instead adding new tcmu_vma_open() and tcmu_vma_close()
handlers that only do refcounting.
Link: https://lore.kernel.org/r/20210218175039.7829-3-bostroesser@gmail.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
bitmap_free(udev->data_bitmap);
mutex_unlock(&udev->cmdr_lock);
+ pr_debug("dev_kref_release\n");
+
call_rcu(&dev->rcu_head, tcmu_dev_call_rcu);
}
return page;
}
+static void tcmu_vma_open(struct vm_area_struct *vma)
+{
+ struct tcmu_dev *udev = vma->vm_private_data;
+
+ pr_debug("vma_open\n");
+
+ kref_get(&udev->kref);
+}
+
+static void tcmu_vma_close(struct vm_area_struct *vma)
+{
+ struct tcmu_dev *udev = vma->vm_private_data;
+
+ pr_debug("vma_close\n");
+
+ /* release ref from tcmu_vma_open */
+ kref_put(&udev->kref, tcmu_dev_kref_release);
+}
+
static vm_fault_t tcmu_vma_fault(struct vm_fault *vmf)
{
struct tcmu_dev *udev = vmf->vma->vm_private_data;
}
static const struct vm_operations_struct tcmu_vm_ops = {
+ .open = tcmu_vma_open,
+ .close = tcmu_vma_close,
.fault = tcmu_vma_fault,
};
if (vma_pages(vma) != (udev->ring_size >> PAGE_SHIFT))
return -EINVAL;
+ tcmu_vma_open(vma);
+
return 0;
}
return -EBUSY;
udev->inode = inode;
- kref_get(&udev->kref);
pr_debug("open\n");
clear_bit(TCMU_DEV_BIT_OPEN, &udev->flags);
pr_debug("close\n");
- /* release ref from open */
- kref_put(&udev->kref, tcmu_dev_kref_release);
+
return 0;
}