Optimize clear_brick_table. (#65837)
author Peter Sollich <petersol@microsoft.com>
Tue, 1 Mar 2022 16:34:00 +0000 (17:34 +0100)
committer GitHub <noreply@github.com>
Tue, 1 Mar 2022 16:34:00 +0000 (17:34 +0100)
I observed that the inlined call to clear_brick_table in clear_region_info took more CPU samples than necessary. Calling memset is about 7x faster than coding a straightforward loop.
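
For illustration only, here is a minimal standalone sketch of the two approaches side by side. It is not the GC code: the `brick_entry` type, the function names, and the `same_result` helper are hypothetical, and the 16-bit element type is an assumption. The early return for an empty or inverted range is also an addition of this sketch, since the loop silently does nothing in that case while an unguarded size computation would wrap around.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a brick table entry; the element type is an assumption.
using brick_entry = int16_t;

// Element-by-element clear of [from_brick, end_brick), as in the removed lines.
void clear_bricks_loop(brick_entry* table, size_t from_brick, size_t end_brick)
{
    for (size_t i = from_brick; i < end_brick; i++)
        table[i] = 0;
}

// Bulk clear of the same range with a single memset call, as in the added lines.
void clear_bricks_memset(brick_entry* table, size_t from_brick, size_t end_brick)
{
    // Guard added in this sketch: the loop is a no-op for an empty or inverted
    // range, whereas (end_brick - from_brick) would wrap around as a size_t.
    if (end_brick <= from_brick)
        return;
    memset(&table[from_brick], 0, sizeof(table[from_brick]) * (end_brick - from_brick));
}

// Fills two identical tables with a non-zero pattern, clears the same range
// with each variant, and reports whether the resulting tables are equal.
bool same_result(size_t table_size, size_t from_brick, size_t end_brick)
{
    assert(from_brick <= table_size && end_brick <= table_size);
    std::vector<brick_entry> a(table_size, -1), b(table_size, -1);
    clear_bricks_loop(a.data(), from_brick, end_brick);
    clear_bricks_memset(b.data(), from_brick, end_brick);
    return a == b;
}
```

Writing all-zero bytes yields a zero value for any integer element type, which is why the byte-wise memset can stand in for the per-element assignment here. The speedup quoted above is the author's measurement; this sketch only checks that the two variants clear the same entries.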

src/coreclr/gc/gc.cpp

index df9ba55..5df3979 100644
@@ -8034,8 +8034,9 @@ uint8_t* gc_heap::brick_address (size_t brick)
 
 void gc_heap::clear_brick_table (uint8_t* from, uint8_t* end)
 {
-    for (size_t i = brick_of (from);i < brick_of (end); i++)
-        brick_table[i] = 0;
+    size_t from_brick = brick_of (from);
+    size_t end_brick = brick_of (end);
+    memset (&brick_table[from_brick], 0, sizeof(brick_table[from_brick])*(end_brick-from_brick));
 }
 
 //codes for the brick entries: