mm: zswap: fix potential memory corruption on duplicate store
author Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Fri, 22 Sep 2023 17:22:11 +0000 (19:22 +0200)
committer Andrew Morton <akpm@linux-foundation.org>
Sat, 30 Sep 2023 00:20:47 +0000 (17:20 -0700)
While stress-testing zswap, memory corruption was observed when writing
back pages.  __frontswap_store() used to check for duplicate entries
before attempting to store a page in zswap, because if the store fails
the old entry isn't removed from the tree.  That check went away with
frontswap (see the Fixes: tag below), so a stale entry could remain in
the tree and later be written back over the new data.  This change
removes duplicate entries in zswap_store() before the actual attempt.
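
The failure mode described above can be sketched with a toy userspace
model.  This is only an illustration under stated assumptions, not kernel
code: every name in it (toy_tree, toy_store, toy_invalidate, toy_has_stale)
is hypothetical, a flat array stands in for the rbtree, and locking and
writeback are not modelled.  The can_store flag simulates a store attempt
that fails for any reason.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 8

/* One cached copy of a page, keyed by swap offset (hypothetical stand-in
 * for a zswap entry). */
struct toy_entry {
	bool present;
	int data;
};

/* Flat array standing in for the per-swapfile tree. */
struct toy_tree {
	struct toy_entry slot[TOY_SLOTS];
};

/* Drop any cached copy for this offset. */
static void toy_invalidate(struct toy_tree *t, size_t offset)
{
	assert(offset < TOY_SLOTS);
	t->slot[offset].present = false;
}

/*
 * Try to cache `data` at `offset`.
 * can_store:        simulates whether the store attempt succeeds.
 * invalidate_first: true models the behaviour after this fix, where a
 *                   duplicate is removed before the attempt.
 */
static bool toy_store(struct toy_tree *t, size_t offset, int data,
		      bool can_store, bool invalidate_first)
{
	assert(offset < TOY_SLOTS);

	if (invalidate_first)
		toy_invalidate(t, offset);

	if (!can_store)
		return false;	/* caller would fall back to the swap device */

	t->slot[offset].present = true;
	t->slot[offset].data = data;
	return true;
}

/* True if the cache still holds a copy that differs from the latest data. */
static bool toy_has_stale(const struct toy_tree *t, size_t offset, int latest)
{
	assert(offset < TOY_SLOTS);
	return t->slot[offset].present && t->slot[offset].data != latest;
}
```

In this model, storing a value successfully and then failing a second
store to the same offset leaves the first value in the cache when
invalidate_first is false, and leaves the slot empty when it is true.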

[cerasuolodomenico@gmail.com: add a warning and a comment, per Johannes]
Link: https://lkml.kernel.org/r/20230925130002.1929369-1-cerasuolodomenico@gmail.com
Link: https://lkml.kernel.org/r/20230922172211.1704917-1-cerasuolodomenico@gmail.com
Fixes: 42c06a0e8ebe ("mm: kill frontswap")
Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/zswap.c

index 412b140..083c693 100644
@@ -1219,6 +1219,19 @@ bool zswap_store(struct folio *folio)
                return false;
 
        /*
+        * If this is a duplicate, it must be removed before attempting to store
+        * it, otherwise, if the store fails the old page won't be removed from
+        * the tree, and it might be written back overriding the new data.
+        */
+       spin_lock(&tree->lock);
+       dupentry = zswap_rb_search(&tree->rbroot, offset);
+       if (dupentry) {
+               zswap_duplicate_entry++;
+               zswap_invalidate_entry(tree, dupentry);
+       }
+       spin_unlock(&tree->lock);
+
+       /*
         * XXX: zswap reclaim does not work with cgroups yet. Without a
         * cgroup-aware entry LRU, we will push out entries system-wide based on
         * local cgroup limits.
@@ -1333,7 +1346,14 @@ insert_entry:
 
        /* map */
        spin_lock(&tree->lock);
+       /*
+        * A duplicate entry should have been removed at the beginning of this
+        * function. Since the swap entry should be pinned, if a duplicate is
+        * found again here it means that something went wrong in the swap
+        * cache.
+        */
        while (zswap_rb_insert(&tree->rbroot, entry, &dupentry) == -EEXIST) {
+               WARN_ON(1);
                zswap_duplicate_entry++;
                zswap_invalidate_entry(tree, dupentry);
        }