cxl/hdm: Fix DPA reservation vs cxl_endpoint_decoder lifetime
authorDan Williams <dan.j.williams@intel.com>
Wed, 27 Jul 2022 22:16:46 +0000 (15:16 -0700)
committerDan Williams <dan.j.williams@intel.com>
Mon, 1 Aug 2022 22:36:33 +0000 (15:36 -0700)
commit4d5c42a80bd17d1979dbcd40c5c44ff3c93e1476
tree129032021aae2442aacac13c6e91dbb847147f1e
parente77483055c325fa629c5835913b07f3dce3ac7fd
cxl/hdm: Fix DPA reservation vs cxl_endpoint_decoder lifetime

After adding support for emulating platform firmware established DPA
reservations, the cxl-topology.sh [1] unit test started crashing with
the following signature:

 general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP
 [..]
 RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core]
 [..]
 Call Trace:
  <TASK>
  __cxl_dpa_release+0x1b/0xd0 [cxl_core]
  cxl_dpa_release+0x1d/0x30 [cxl_core]
  release_nodes+0x63/0x90
  devres_release_all+0x88/0xc0

...i.e. a use after free of a 'struct cxl_endpoint_decoder' object. This
results from the ordering of init_hdm_decoder() before add_hdm_decoder()
where, at release time, the decoder is unregistered and released before
the DPA reservation.

Fix this by extending the life of the object until all DPA reservations
have been released which also preserves platform decoder settings being
settled by the time the decoder is published in sysfs (KOBJ_ADD time).

Note that the @len == 0 case in __cxl_dpa_reserve() is avoided in
practice as this function is only called for committed decoders and new
non-zero DPA allocations.

Link: https://github.com/pmem/ndctl/blob/pending/test/cxl-topology.sh
Fixes: 9c57cde0dcbd ("cxl/hdm: Enumerate allocated DPA")
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/165896020625.3546860.12390103413706292760.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
drivers/cxl/core/hdm.c