mm, memfd: fix COW issue on MAP_PRIVATE and F_SEAL_FUTURE_WRITE mappings
authorNicolas Geoffray <ngeoffray@google.com>
Sun, 1 Dec 2019 01:53:28 +0000 (17:53 -0800)
committerLinus Torvalds <torvalds@linux-foundation.org>
Sun, 1 Dec 2019 20:59:03 +0000 (12:59 -0800)
F_SEAL_FUTURE_WRITE has unexpected behavior when used with MAP_PRIVATE:
A private mapping created after the memfd file that gets sealed with
F_SEAL_FUTURE_WRITE loses the copy-on-write at fork behavior, meaning
children and parent share the same memory, even though the mapping is
private.

The reason for this is due to the code below:

  static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
  {
        struct shmem_inode_info *info = SHMEM_I(file_inode(file));

        if (info->seals & F_SEAL_FUTURE_WRITE) {
                /*
                 * New PROT_WRITE and MAP_SHARED mmaps are not allowed when
                 * "future write" seal active.
                 */
                if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_WRITE))
                        return -EPERM;

                /*
                 * Since the F_SEAL_FUTURE_WRITE seals allow for a MAP_SHARED
                 * read-only mapping, take care to not allow mprotect to revert
                 * protections.
                 */
                vma->vm_flags &= ~(VM_MAYWRITE);
        }
        ...
  }

And for the mm to know if a mapping is copy-on-write:

  static inline bool is_cow_mapping(vm_flags_t flags)
  {
        return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
  }

The patch fixes the issue by making the mprotect revert protection
happen only for shared mappings.  For private mappings, using mprotect
will have no effect on the seal behavior.

The F_SEAL_FUTURE_WRITE feature was introduced in v5.1 so v5.3.x stable
kernels would need a backport.

[akpm@linux-foundation.org: reflow comment, per Christoph]
Link: http://lkml.kernel.org/r/20191107195355.80608-1-joel@joelfernandes.org
Fixes: ab3948f58ff84 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd")
Signed-off-by: Nicolas Geoffray <ngeoffray@google.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/shmem.c

index 9ec9dd1946d62801fb2fc029eb132fa5346686bd..60de3d9e26a71ecb6740733a988b78ce0882896f 100644 (file)
@@ -2214,11 +2214,14 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
                        return -EPERM;
 
                /*
-                * Since the F_SEAL_FUTURE_WRITE seals allow for a MAP_SHARED
-                * read-only mapping, take care to not allow mprotect to revert
-                * protections.
+                * Since an F_SEAL_FUTURE_WRITE sealed memfd can be mapped as
+                * MAP_SHARED and read-only, take care to not allow mprotect to
+                * revert protections on such mappings. Do this only for shared
+                * mappings. For private mappings, don't need to mask
+                * VM_MAYWRITE as we still want them to be COW-writable.
                 */
-               vma->vm_flags &= ~(VM_MAYWRITE);
+               if (vma->vm_flags & VM_SHARED)
+                       vma->vm_flags &= ~(VM_MAYWRITE);
        }
 
        file_accessed(file);