Info: Entering OpenMP data region at zaxpy.cpp:14:1 with 2 arguments:
Info: to(X[0:N])[16384]
Info: tofrom(Y[0:N])[16384]
- Info: Creating new map entry with HstPtrBegin=0x00007ffde9e99000,
- TgtPtrBegin=0x00007f15dc600000, Size=16384, Name=X[0:N]
- Info: Copying data from host to device, HstPtr=0x00007ffde9e99000,
- TgtPtr=0x00007f15dc600000, Size=16384, Name=X[0:N]
- Info: Creating new map entry with HstPtrBegin=0x00007ffde9e95000,
- TgtPtrBegin=0x00007f15dc604000, Size=16384, Name=Y[0:N]
- Info: Copying data from host to device, HstPtr=0x00007ffde9e95000,
- TgtPtr=0x00007f15dc604000, Size=16384, Name=Y[0:N]
+ Info: Creating new map entry with HstPtrBegin=0x00007fff0d259a40,
+ TgtPtrBegin=0x00007fdba5800000, Size=16384, RefCount=1, Name=X[0:N]
+ Info: Copying data from host to device, HstPtr=0x00007fff0d259a40,
+ TgtPtr=0x00007fdba5800000, Size=16384, Name=X[0:N]
+ Info: Creating new map entry with HstPtrBegin=0x00007fff0d255a40,
+ TgtPtrBegin=0x00007fdba5804000, Size=16384, RefCount=1, Name=Y[0:N]
+ Info: Copying data from host to device, HstPtr=0x00007fff0d255a40,
+ TgtPtr=0x00007fdba5804000, Size=16384, Name=Y[0:N]
Info: OpenMP Host-Device pointer mappings after block at zaxpy.cpp:14:1:
Info: Host Ptr Target Ptr Size (B) RefCount Declaration
- Info: 0x00007ffde9e95000 0x00007f15dc604000 16384 1 Y[0:N] at zaxpy.cpp:13:17
- Info: 0x00007ffde9e99000 0x00007f15dc600000 16384 1 X[0:N] at zaxpy.cpp:13:11
+ Info: 0x00007fff0d255a40 0x00007fdba5804000 16384 1 Y[0:N] at zaxpy.cpp:13:17
+ Info: 0x00007fff0d259a40 0x00007fdba5800000 16384 1 X[0:N] at zaxpy.cpp:13:11
Info: Entering OpenMP kernel at zaxpy.cpp:6:1 with 4 arguments:
Info: firstprivate(N)[8] (implicit)
Info: use_address(Y)[0] (implicit)
Info: tofrom(D)[16] (implicit)
Info: use_address(X)[0] (implicit)
- Info: Mapping exists (implicit) with HstPtrBegin=0x00007ffde9e95000,
- TgtPtrBegin=0x00007f15dc604000, Size=0, updated RefCount=2, Name=Y
- Info: Creating new map entry with HstPtrBegin=0x00007ffde9e94fb0,
- TgtPtrBegin=0x00007f15dc608000, Size=16, Name=D
- Info: Copying data from host to device, HstPtr=0x00007ffde9e94fb0,
- TgtPtr=0x00007f15dc608000, Size=16, Name=D
- Info: Mapping exists (implicit) with HstPtrBegin=0x00007ffde9e99000,
- TgtPtrBegin=0x00007f15dc600000, Size=0, updated RefCount=2, Name=X
- Info: Launching kernel __omp_offloading_fd02_e25f6e76__Z5zaxpyPSt7complexIdES1_S0_m_l6
+ Info: Mapping exists (implicit) with HstPtrBegin=0x00007fff0d255a40,
+ TgtPtrBegin=0x00007fdba5804000, Size=0, RefCount=2 (incremented), Name=Y
+ Info: Creating new map entry with HstPtrBegin=0x00007fff0d2559f0,
+ TgtPtrBegin=0x00007fdba5808000, Size=16, RefCount=1, Name=D
+ Info: Copying data from host to device, HstPtr=0x00007fff0d2559f0,
+ TgtPtr=0x00007fdba5808000, Size=16, Name=D
+ Info: Mapping exists (implicit) with HstPtrBegin=0x00007fff0d259a40,
+ TgtPtrBegin=0x00007fdba5800000, Size=0, RefCount=2 (incremented), Name=X
+ Info: Mapping exists with HstPtrBegin=0x00007fff0d255a40,
+ TgtPtrBegin=0x00007fdba5804000, Size=0, RefCount=2 (update suppressed)
+ Info: Mapping exists with HstPtrBegin=0x00007fff0d2559f0,
+ TgtPtrBegin=0x00007fdba5808000, Size=16, RefCount=1 (update suppressed)
+ Info: Mapping exists with HstPtrBegin=0x00007fff0d259a40,
+ TgtPtrBegin=0x00007fdba5800000, Size=0, RefCount=2 (update suppressed)
+ Info: Launching kernel __omp_offloading_10305_c08c86__Z5zaxpyPSt7complexIdES1_S0_m_l6
with 8 blocks and 128 threads in SPMD mode
- Info: Copying data from device to host, TgtPtr=0x00007f15dc608000,
- HstPtr=0x00007ffde9e94fb0, Size=16, Name=D
- Info: Removing map entry with HstPtrBegin=0x00007ffde9e94fb0,
- TgtPtrBegin=0x00007f15dc608000, Size=16, Name=D
+ Info: Mapping exists with HstPtrBegin=0x00007fff0d259a40,
+ TgtPtrBegin=0x00007fdba5800000, Size=0, RefCount=1 (decremented)
+ Info: Mapping exists with HstPtrBegin=0x00007fff0d2559f0,
+ TgtPtrBegin=0x00007fdba5808000, Size=16, RefCount=1 (deferred final decrement)
+ Info: Copying data from device to host, TgtPtr=0x00007fdba5808000,
+ HstPtr=0x00007fff0d2559f0, Size=16, Name=D
+ Info: Mapping exists with HstPtrBegin=0x00007fff0d255a40,
+ TgtPtrBegin=0x00007fdba5804000, Size=0, RefCount=1 (decremented)
+ Info: Removing map entry with HstPtrBegin=0x00007fff0d2559f0,
+ TgtPtrBegin=0x00007fdba5808000, Size=16, Name=D
Info: OpenMP Host-Device pointer mappings after block at zaxpy.cpp:6:1:
Info: Host Ptr Target Ptr Size (B) RefCount Declaration
- Info: 0x00007ffde9e95000 0x00007f15dc604000 16384 1 Y[0:N] at zaxpy.cpp:13:17
- Info: 0x00007ffde9e99000 0x00007f15dc600000 16384 1 X[0:N] at zaxpy.cpp:13:11
+ Info: 0x00007fff0d255a40 0x00007fdba5804000 16384 1 Y[0:N] at zaxpy.cpp:13:17
+ Info: 0x00007fff0d259a40 0x00007fdba5800000 16384 1 X[0:N] at zaxpy.cpp:13:11
Info: Exiting OpenMP data region at zaxpy.cpp:14:1 with 2 arguments:
Info: to(X[0:N])[16384]
Info: tofrom(Y[0:N])[16384]
- Info: Copying data from device to host, TgtPtr=0x00007f15dc604000,
- HstPtr=0x00007ffde9e95000, Size=16384, Name=Y[0:N]
- Info: Removing map entry with HstPtrBegin=0x00007ffde9e95000,
- TgtPtrBegin=0x00007f15dc604000, Size=16384, Name=Y[0:N]
- Info: Removing map entry with HstPtrBegin=0x00007ffde9e99000,
- TgtPtrBegin=0x00007f15dc600000, Size=16384, Name=X[0:N]
+ Info: Mapping exists with HstPtrBegin=0x00007fff0d255a40,
+ TgtPtrBegin=0x00007fdba5804000, Size=16384, RefCount=1 (deferred final decrement)
+ Info: Copying data from device to host, TgtPtr=0x00007fdba5804000,
+ HstPtr=0x00007fff0d255a40, Size=16384, Name=Y[0:N]
+ Info: Mapping exists with HstPtrBegin=0x00007fff0d259a40,
+ TgtPtrBegin=0x00007fdba5800000, Size=16384, RefCount=1 (deferred final decrement)
+ Info: Removing map entry with HstPtrBegin=0x00007fff0d255a40,
+ TgtPtrBegin=0x00007fdba5804000, Size=16384, Name=Y[0:N]
+ Info: Removing map entry with HstPtrBegin=0x00007fff0d259a40,
+ TgtPtrBegin=0x00007fdba5800000, Size=16384, Name=X[0:N]
From this information, we can see the OpenMP kernel being launched on the CUDA
device with enough threads and blocks for all ``1024`` iterations of the loop in
}
// Mapping does not exist, allocate it with refCount=INF
- auto Res = HostDataToTargetMap.emplace(
- (uintptr_t)HstPtrBegin /*HstPtrBase*/,
- (uintptr_t)HstPtrBegin /*HstPtrBegin*/,
- (uintptr_t)HstPtrBegin + Size /*HstPtrEnd*/,
- (uintptr_t)TgtPtrBegin /*TgtPtrBegin*/, nullptr, true /*IsRefCountINF*/);
- auto NewEntry = Res.first;
+ const HostDataToTargetTy &newEntry =
+ *HostDataToTargetMap
+ .emplace(
+ /*HstPtrBase=*/(uintptr_t)HstPtrBegin,
+ /*HstPtrBegin=*/(uintptr_t)HstPtrBegin,
+ /*HstPtrEnd=*/(uintptr_t)HstPtrBegin + Size,
+ /*TgtPtrBegin=*/(uintptr_t)TgtPtrBegin, /*Name=*/nullptr,
+ /*IsRefCountINF=*/true)
+ .first;
DP("Creating new map entry: HstBase=" DPxMOD ", HstBegin=" DPxMOD
- ", HstEnd=" DPxMOD ", TgtBegin=" DPxMOD "\n",
- DPxPTR(NewEntry->HstPtrBase), DPxPTR(NewEntry->HstPtrBegin),
- DPxPTR(NewEntry->HstPtrEnd), DPxPTR(NewEntry->TgtPtrBegin));
+ ", HstEnd=" DPxMOD ", TgtBegin=" DPxMOD ", RefCount=%s\n",
+ DPxPTR(newEntry.HstPtrBase), DPxPTR(newEntry.HstPtrBegin),
+ DPxPTR(newEntry.HstPtrEnd), DPxPTR(newEntry.TgtPtrBegin),
+ newEntry.refCountToStr().c_str());
DataMapMtx.unlock();
((lr.Flags.ExtendsBefore || lr.Flags.ExtendsAfter) && IsImplicit)) {
auto &HT = *lr.Entry;
IsNew = false;
-
if (UpdateRefCount)
HT.incRefCount();
-
uintptr_t tp = HT.TgtPtrBegin + ((uintptr_t)HstPtrBegin - HT.HstPtrBegin);
INFO(OMP_INFOTYPE_MAPPING_EXISTS, DeviceID,
"Mapping exists%s with HstPtrBegin=" DPxMOD ", TgtPtrBegin=" DPxMOD
", "
- "Size=%" PRId64 ",%s RefCount=%s, Name=%s\n",
+ "Size=%" PRId64 ", RefCount=%s (%s), Name=%s\n",
(IsImplicit ? " (implicit)" : ""), DPxPTR(HstPtrBegin), DPxPTR(tp),
- Size, (UpdateRefCount ? " updated" : ""),
- HT.isRefCountInf() ? "INF" : std::to_string(HT.getRefCount()).c_str(),
+ Size, HT.refCountToStr().c_str(),
+ UpdateRefCount ? "incremented" : "update suppressed",
(HstPtrName) ? getNameFromMapping(HstPtrName).c_str() : "unknown");
rc = (void *)tp;
} else if ((lr.Flags.ExtendsBefore || lr.Flags.ExtendsAfter) && !IsImplicit) {
// In addition to the mapping rules above, the close map modifier forces the
// mapping of the variable to the device.
if (Size) {
- DP("Return HstPtrBegin " DPxMOD " Size=%" PRId64 " RefCount=%s\n",
- DPxPTR((uintptr_t)HstPtrBegin), Size,
- (UpdateRefCount ? " updated" : ""));
+ DP("Return HstPtrBegin " DPxMOD " Size=%" PRId64 " for unified shared "
+ "memory\n",
+ DPxPTR((uintptr_t)HstPtrBegin), Size);
IsHostPtr = true;
rc = HstPtrBegin;
}
// If it is not contained and Size > 0, we should create a new entry for it.
IsNew = true;
uintptr_t tp = (uintptr_t)allocData(Size, HstPtrBegin);
+ const HostDataToTargetTy &newEntry =
+ *HostDataToTargetMap
+ .emplace((uintptr_t)HstPtrBase, (uintptr_t)HstPtrBegin,
+ (uintptr_t)HstPtrBegin + Size, tp, HstPtrName)
+ .first;
INFO(OMP_INFOTYPE_MAPPING_CHANGED, DeviceID,
"Creating new map entry with "
- "HstPtrBegin=" DPxMOD ", TgtPtrBegin=" DPxMOD ", Size=%ld, Name=%s\n",
+ "HstPtrBegin=" DPxMOD ", TgtPtrBegin=" DPxMOD ", Size=%ld, "
+ "RefCount=%s, Name=%s\n",
DPxPTR(HstPtrBegin), DPxPTR(tp), Size,
+ newEntry.refCountToStr().c_str(),
(HstPtrName) ? getNameFromMapping(HstPtrName).c_str() : "unknown");
- HostDataToTargetMap.emplace((uintptr_t)HstPtrBase, (uintptr_t)HstPtrBegin,
- (uintptr_t)HstPtrBegin + Size, tp, HstPtrName);
rc = (void *)tp;
}
if (lr.Flags.IsContained ||
(!MustContain && (lr.Flags.ExtendsBefore || lr.Flags.ExtendsAfter))) {
auto &HT = *lr.Entry;
+ // We do not decrement the reference count to zero here. deallocTgtPtr does
+ // that atomically with removing the mapping. Otherwise, before this thread
+ // removed the mapping in deallocTgtPtr, another thread could retrieve the
+ // mapping, increment and decrement back to zero, and then both threads
+ // would try to remove the mapping, resulting in a double free.
IsLast = HT.getRefCount() == 1;
-
- if (!IsLast && UpdateRefCount)
+ const char *RefCountAction;
+ if (!UpdateRefCount)
+ RefCountAction = "update suppressed";
+ else if (IsLast)
+ RefCountAction = "deferred final decrement";
+ else {
+ RefCountAction = "decremented";
HT.decRefCount();
-
+ }
uintptr_t tp = HT.TgtPtrBegin + ((uintptr_t)HstPtrBegin - HT.HstPtrBegin);
- DP("Mapping exists with HstPtrBegin=" DPxMOD ", TgtPtrBegin=" DPxMOD ", "
- "Size=%" PRId64 ",%s RefCount=%s\n",
- DPxPTR(HstPtrBegin), DPxPTR(tp), Size,
- (UpdateRefCount ? " updated" : ""),
- HT.isRefCountInf() ? "INF" : std::to_string(HT.getRefCount()).c_str());
+ INFO(OMP_INFOTYPE_MAPPING_EXISTS, DeviceID,
+ "Mapping exists with HstPtrBegin=" DPxMOD ", TgtPtrBegin=" DPxMOD ", "
+ "Size=%" PRId64 ", RefCount=%s (%s)\n",
+ DPxPTR(HstPtrBegin), DPxPTR(tp), Size, HT.refCountToStr().c_str(),
+ RefCountAction);
rc = (void *)tp;
} else if (PM->RTLs.RequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY) {
// If the value isn't found in the mapping and unified shared memory
// is on then it means we have stumbled upon a value which we need to
// use directly from the host.
- DP("Get HstPtrBegin " DPxMOD " Size=%" PRId64 " RefCount=%s\n",
- DPxPTR((uintptr_t)HstPtrBegin), Size,
- (UpdateRefCount ? " updated" : ""));
+ DP("Get HstPtrBegin " DPxMOD " Size=%" PRId64 " for unified shared "
+ "memory\n",
+ DPxPTR((uintptr_t)HstPtrBegin), Size);
IsHostPtr = true;
rc = HstPtrBegin;
}
}
bool isRefCountInf() const { return RefCount == INFRefCount; }
+
+ std::string refCountToStr() const {
+ return isRefCountInf() ? "INF" : std::to_string(getRefCount());
+ }
};
typedef uintptr_t HstPtrBeginTy;
for (const auto &HostTargetMap : Device.HostDataToTargetMap) {
SourceInfo Info(HostTargetMap.HstPtrName);
INFO(OMP_INFOTYPE_ALL, Device.DeviceID,
- DPxMOD " " DPxMOD " %-8" PRIuPTR " %-8" PRId64 " %s at %s:%d:%d\n",
+ DPxMOD " " DPxMOD " %-8" PRIuPTR " %-8s %s at %s:%d:%d\n",
DPxPTR(HostTargetMap.HstPtrBegin), DPxPTR(HostTargetMap.TgtPtrBegin),
HostTargetMap.HstPtrEnd - HostTargetMap.HstPtrBegin,
- HostTargetMap.getRefCount(), Info.getName(), Info.getFilename(),
- Info.getLine(), Info.getColumn());
+ HostTargetMap.refCountToStr().c_str(), Info.getName(),
+ Info.getFilename(), Info.getLine(), Info.getColumn());
}
Device.DataMapMtx.unlock();
}
#define N 64
+#pragma omp declare target
+int global;
+#pragma omp end declare target
+
extern void __tgt_set_info_flag(unsigned);
int main() {
// INFO: Libomptarget device 0 info: alloc(A[0:64])[256]
// INFO: Libomptarget device 0 info: tofrom(B[0:64])[256]
// INFO: Libomptarget device 0 info: to(C[0:64])[256]
-// INFO: Libomptarget device 0 info: Creating new map entry with HstPtrBegin={{.*}}, TgtPtrBegin={{.*}}, Size=256, Name=A[0:64]
-// INFO: Libomptarget device 0 info: Creating new map entry with HstPtrBegin={{.*}}, TgtPtrBegin={{.*}}, Size=256, Name=B[0:64]
+// INFO: Libomptarget device 0 info: Creating new map entry with HstPtrBegin={{.*}}, TgtPtrBegin={{.*}}, Size=256, RefCount=1, Name=A[0:64]
+// INFO: Libomptarget device 0 info: Creating new map entry with HstPtrBegin={{.*}}, TgtPtrBegin={{.*}}, Size=256, RefCount=1, Name=B[0:64]
// INFO: Libomptarget device 0 info: Copying data from host to device, HstPtr={{.*}}, TgtPtr={{.*}}, Size=256, Name=B[0:64]
-// INFO: Libomptarget device 0 info: Creating new map entry with HstPtrBegin={{.*}}, TgtPtrBegin={{.*}}, Size=256, Name=C[0:64]
+// INFO: Libomptarget device 0 info: Creating new map entry with HstPtrBegin={{.*}}, TgtPtrBegin={{.*}}, Size=256, RefCount=1, Name=C[0:64]
// INFO: Libomptarget device 0 info: Copying data from host to device, HstPtr={{.*}}, TgtPtr={{.*}}, Size=256, Name=C[0:64]
// INFO: Libomptarget device 0 info: OpenMP Host-Device pointer mappings after block at info.c:{{[0-9]+}}:{{[0-9]+}}:
// INFO: Libomptarget device 0 info: Host Ptr Target Ptr Size (B) RefCount Declaration
// INFO: Libomptarget device 0 info: Removing map entry with HstPtrBegin={{.*}}, TgtPtrBegin={{.*}}, Size=256, Name=C[0:64]
// INFO: Libomptarget device 0 info: Removing map entry with HstPtrBegin={{.*}}, TgtPtrBegin={{.*}}, Size=256, Name=B[0:64]
// INFO: Libomptarget device 0 info: Removing map entry with HstPtrBegin={{.*}}, TgtPtrBegin={{.*}}, Size=256, Name=A[0:64]
+// INFO: Libomptarget device 0 info: OpenMP Host-Device pointer mappings after block at info.c:[[#%u,]]:[[#%u,]]:
+// INFO: Libomptarget device 0 info: Host Ptr Target Ptr Size (B) RefCount Declaration
+// INFO: Libomptarget device 0 info: [[#%#x,]] [[#%#x,]] 4 INF unknown at unknown:0:0
#pragma omp target data map(alloc:A[0:N]) map(tofrom:B[0:N]) map(to:C[0:N])
#pragma omp target firstprivate(val)
{ val = 1; }