[analyzer] Prefer wrapping SymbolicRegions by ElementRegions
authorBalazs Benics <benicsbalazs@gmail.com>
Tue, 13 Sep 2022 06:58:46 +0000 (08:58 +0200)
committerBalazs Benics <benicsbalazs@gmail.com>
Tue, 13 Sep 2022 06:58:46 +0000 (08:58 +0200)
commitf8643a9b31c4029942f67d4534c9139b45173504
tree253381d4be4901bc41c063657c98f90fd8ed52c9
parent7ed68182d741cf78e7cf87399d8ae6de73bf723c
[analyzer] Prefer wrapping SymbolicRegions by ElementRegions

It turns out that in certain cases `SymbolRegions` are wrapped by
`ElementRegions`; in others, it's not. This discrepancy can cause the
analyzer not to recognize if the two regions are actually referring to
the same entity, which then can lead to unreachable paths discovered.

Consider this example:

```lang=C++
struct Node { int* ptr; };
void with_structs(Node* n1) {
  Node c = *n1; // copy
  Node* n2 = &c;
  clang_analyzer_dump(*n1); // lazy...
  clang_analyzer_dump(*n2); // lazy...
  clang_analyzer_dump(n1->ptr); // rval(n1->ptr): reg_$2<int * SymRegion{reg_$0<struct Node * n1>}.ptr>
  clang_analyzer_dump(n2->ptr); // rval(n2->ptr): reg_$1<int * Element{SymRegion{reg_$0<struct Node * n1>},0 S64b,struct Node}.ptr>
  clang_analyzer_eval(n1->ptr != n2->ptr); // UNKNOWN, bad!
  (void)(*n1);
  (void)(*n2);
}
```

The copy of `n1` will insert a new binding to the store; but for doing
that it actually must create a `TypedValueRegion` which it could pass to
the `LazyCompoundVal`. Since the memregion in question is a
`SymbolicRegion` - which is untyped, it needs to first wrap it into an
`ElementRegion` basically implementing this untyped -> typed conversion
for the sake of passing it to the `LazyCompoundVal`.
So, this is why we have `Element{SymRegion{.}, 0,struct Node}` for `n1`.

The problem appears if the analyzer evaluates a read from the expression
`n1->ptr`. The same logic won't apply for `SymbolRegionValues`, since
they accept raw `SubRegions`, hence the `SymbolicRegion` won't be
wrapped into an `ElementRegion` in that case.

Later when we arrive at the equality comparison, we cannot prove that
they are equal.

For more details check the corresponding thread on discourse:
https://discourse.llvm.org/t/are-symbolicregions-really-untyped/64406

---

In this patch, I'm eagerly wrapping each `SymbolicRegion` by an
`ElementRegion`; basically canonicalizing to this form.
It seems reasonable to do so since any object can be thought of as a single
array of that object; so this should not make much of a difference.

The tests also underpin this assumption, as only a few were broken by
this change; and actually fixed a FIXME along the way.

About the second example, which does the same copy operation - but on
the heap - it will be fixed by the next patch.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D132142
14 files changed:
clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h
clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
clang/lib/StaticAnalyzer/Checkers/ExprInspectionChecker.cpp
clang/lib/StaticAnalyzer/Checkers/NullabilityChecker.cpp
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
clang/lib/StaticAnalyzer/Core/MemRegion.cpp
clang/lib/StaticAnalyzer/Core/RegionStore.cpp
clang/test/Analysis/ctor.mm
clang/test/Analysis/expr-inspection.c
clang/test/Analysis/memory-model.cpp
clang/test/Analysis/ptr-arith.c
clang/test/Analysis/ptr-arith.cpp
clang/test/Analysis/trivial-copy-struct.cpp [new file with mode: 0644]