This failing signature:
[ 8.392669] cxl_bus_probe: cxl_port endpoint2: probe:
970997760
[ 8.392670] cxl_port: probe of endpoint2 failed with error
970997760
[ 8.392719] create_endpoint: cxl_mem mem0: add: endpoint2
[ 8.392721] cxl_mem mem0: endpoint2 failed probe
[ 8.392725] cxl_bus_probe: cxl_mem mem0: probe: -6
...shows cxl_hdm_decode_init() resulting in a return code ("
970997760")
that looks like stack corruption. The problem goes away if
cxl_hdm_decode_init() is not mocked via __wrap_cxl_hdm_decode_init().
The corruption results from the mismatch that the calling convention for
cxl_hdm_decode_init() is:
int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm)
...and __wrap_cxl_hdm_decode_init() is:
bool __wrap_cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm)
...i.e. an int is expected but __wrap_hdm_decode_init() returns bool.
Fix the convention and cleanup the organization to match
__wrap_cxl_await_media_ready() as the difference was a red herring that
distracted from finding the bug.
Fixes: 92804edb11f0 ("cxl/pci: Drop @info argument to cxl_hdm_decode_init()")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/165603870776.551046.8709990108936497723.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
}
EXPORT_SYMBOL_NS_GPL(__wrap_cxl_await_media_ready, CXL);
-bool __wrap_cxl_hdm_decode_init(struct cxl_dev_state *cxlds,
- struct cxl_hdm *cxlhdm)
+int __wrap_cxl_hdm_decode_init(struct cxl_dev_state *cxlds,
+ struct cxl_hdm *cxlhdm)
{
int rc = 0, index;
struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
- if (!ops || !ops->is_mock_dev(cxlds->dev))
+ if (ops && ops->is_mock_dev(cxlds->dev))
+ rc = 0;
+ else
rc = cxl_hdm_decode_init(cxlds, cxlhdm);
put_cxl_mock_ops(index);