[Pytorch Profiler] Move start timestamp to end of start callback (#62191)
author Kimish Patel <kimishpatel@fb.com>
Sat, 14 Aug 2021 04:37:57 +0000 (21:37 -0700)
committer Facebook GitHub Bot <facebook-github-bot@users.noreply.github.com>
Sat, 14 Aug 2021 04:40:12 +0000 (21:40 -0700)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62191

This moves the start timestamp to the end of the start callback. This way
the overhead of collecting the callstack and module hierarchy is not
counted as part of the op's runtime.
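
The effect of the move can be illustrated with a minimal standalone sketch.
It is not the profiler code: ObserverContext, nowUs, collectMetadata, onStart
and measureOp are hypothetical stand-ins, and the sleeps only simulate
metadata collection and op execution.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for the profiler's per-op observer context.
struct ObserverContext {
  int64_t startUs = 0;
  int64_t endUs = 0;
};

// Monotonic time in microseconds (stand-in for getTimeUs()).
inline int64_t nowUs() {
  return std::chrono::duration_cast<std::chrono::microseconds>(
             std::chrono::steady_clock::now().time_since_epoch())
      .count();
}

// Simulates callstack / module hierarchy collection with a sleep.
inline void collectMetadata(int64_t costUs) {
  std::this_thread::sleep_for(std::chrono::microseconds(costUs));
}

// Start callback. stampLast selects where the start timestamp is taken.
inline ObserverContext onStart(bool stampLast, int64_t metadataCostUs) {
  ObserverContext ctx;
  if (!stampLast) {
    ctx.startUs = nowUs(); // old placement: before metadata collection
  }
  collectMetadata(metadataCostUs);
  if (stampLast) {
    ctx.startUs = nowUs(); // new placement: last step of the callback
  }
  return ctx;
}

// Runs a simulated op and returns the duration the profiler would report.
inline int64_t measureOp(bool stampLast, int64_t metadataCostUs,
                         int64_t opCostUs) {
  ObserverContext ctx = onStart(stampLast, metadataCostUs);
  std::this_thread::sleep_for(std::chrono::microseconds(opCostUs)); // the op
  ctx.endUs = nowUs();
  assert(ctx.endUs >= ctx.startUs);
  return ctx.endUs - ctx.startUs;
}
```

With stampLast set to false the reported duration includes the simulated
metadata cost; with stampLast set to true it covers only the op itself,
which is the behavior this change aims for.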

Test Plan:
CI

Imported from OSS

Reviewed By: ilia-cher

Differential Revision: D29910519

fbshipit-source-id: f462031a81ae12b3db7993cf482e5ad93a35e096

torch/csrc/autograd/profiler_kineto.cpp

index 3b5b511..9995237 100644
@@ -304,7 +304,6 @@ void pushProfilingCallbacks() {
 #endif // USE_KINETO
 
           auto ctx_ptr = std::make_unique<KinetoObserverContext>();
-          ctx_ptr->startUs = getTimeUs();
           ctx_ptr->correlationId = corr_id;
           ctx_ptr->startThreadId = at::RecordFunction::currentThreadId();
 
@@ -337,6 +336,7 @@ void pushProfilingCallbacks() {
             ctx_ptr->module_hierarchy = jit::currentModuleHierarchy();
           }
   #endif
+          ctx_ptr->startUs = getTimeUs();
           if (config.state == ProfilerState::KINETO_GPU_FALLBACK) {
             try {
               cudaStubs()->record(nullptr, &ctx_ptr->cuda_event_start_, nullptr);