From f047c9aa5f876228688ab7fde13de55ab97d0e64 Mon Sep 17 00:00:00 2001 From: Peter Sollich Date: Tue, 10 May 2022 18:13:43 +0200 Subject: [PATCH] Fix heap corruption issue 68443. (#69106) This has first seen with regions, but should be an issue with segments as well. What happened was that in revisit_written_pages, we determined the highest allocated address in a region, then obtained the dirty pages up to that high address. In parallel, another thread allocated a new object after the high address and wrote to it. The background GC thread saw the dirty page, but didn't explore the object because it was after its stale value for the high address. That caused some objects pointed at from the beginning of the new object to be seen as dead by background GC. Because the allocating thread also set the card table entries, the next ephemeral GC crashed because the references from the new object were now invalid. The fix is to refetch the high address after we have obtained the dirty pages. That way, we'll explore new objects allocated while we were getting the dirty pages. New objects allocated and written to after we obtained the dirty pages will cause more dirty pages and will thus be explored later. My repro case that caused about a crash every two hours or so has run overnight with this fix without issues. --- src/coreclr/gc/gc.cpp | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/src/coreclr/gc/gc.cpp b/src/coreclr/gc/gc.cpp index 1ef2365..6bbe2f5 100644 --- a/src/coreclr/gc/gc.cpp +++ b/src/coreclr/gc/gc.cpp @@ -33547,6 +33547,11 @@ void gc_heap::revisit_written_pages (BOOL concurrent_p, BOOL reset_only_p) if (!reset_only_p) { + // refetch the high address in case it has changed while we fetched dirty pages + // this is only an issue for the page high_address is on - we may have new + // objects after high_address. + high_address = high_page (seg, concurrent_p); + for (unsigned i = 0; i < bcount; i++) { uint8_t* page = (uint8_t*)background_written_addresses[i]; -- 2.7.4