From: Yonghong Song
Date: Tue, 18 Jul 2017 21:12:10 +0000 (-0700)
Subject: generate proper usdt code to prevent llvm meddling with ctx->#fields
X-Git-Tag: submit/tizen_4.0/20171018.110122~60^2
X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=8206f547b8e3a669db49a3326affbef7012fec49;p=platform%2Fupstream%2Fbcc.git

generate proper usdt code to prevent llvm meddling with ctx->#fields

Qin reported a test case where llvm still messes with ctx->#field
accesses. For code like the following:

  switch(ctx->ip) {
      case 0x7fdf2ede9820ULL: *((int64_t *)dest) = *(volatile int64_t *)&ctx->r12; return 0;
      case 0x7fdf2edecd9cULL: *((int64_t *)dest) = *(volatile int64_t *)&ctx->bx; return 0;
  }

the compiler still generates:

  # r1 is the pointer to the ctx
  r1 += 24
  goto LBB0_4
LBB0_3:
  r1 += 40
LBB0_4:
  r3 = *(u64 *)(r1 + 0)

The verifier will reject the above code since the last load is not in
the "ctx + field_offset" form.

The responsible llvm optimization pass is CFGSimplifyPass, whose main
implementation is in llvm/lib/Transforms/Utils/SimplifyCFG.cpp. The
routine that performs the optimization is SinkThenElseCodeToEnd, and
the routine canSinkInstructions determines whether an insn is a
candidate for sinking. Unfortunately, a volatile load/store does not
disqualify an insn from sinking, but inline assembly does.

In this patch, instead of using volatile to annotate the ctx->#field
access, we do a normal ctx->#field access but place a compiler inline
assembly memory barrier

  __asm__ __volatile__("": : :"memory");

after the field access.

Tested with the usdt unit test case, the usdt_samples example, and a
couple of usdt unit tests developed in the past.

Signed-off-by: Yonghong Song
---

diff --git a/src/cc/usdt.h b/src/cc/usdt.h
index ab14f2b1..313804f0 100644
--- a/src/cc/usdt.h
+++ b/src/cc/usdt.h
@@ -35,6 +35,9 @@ class ArgumentParser;
 static const std::string USDT_PROGRAM_HEADER =
     "#include <uapi/linux/ptrace.h>\n";
 
+static const std::string COMPILER_BARRIER =
+    "__asm__ __volatile__(\"\": : :\"memory\");";
+
 class Argument {
 private:
   optional<int> arg_size_;
diff --git a/src/cc/usdt_args.cc b/src/cc/usdt_args.cc
index ee1a3ee5..42fb360f 100644
--- a/src/cc/usdt_args.cc
+++ b/src/cc/usdt_args.cc
@@ -67,20 +67,27 @@ bool Argument::assign_to_local(std::ostream &stream,
   }
 
   if (!deref_offset_) {
-    tfm::format(stream, "%s = *(volatile %s *)&ctx->%s;", local_name, ctype(),
-                *base_register_name_);
+    tfm::format(stream, "%s = ctx->%s;", local_name, *base_register_name_);
+    // Put a compiler barrier to prevent optimization
+    // like llvm SimplifyCFG SinkThenElseCodeToEnd.
+    // Volatile marking is not sufficient to prevent such optimization.
+    tfm::format(stream, " %s", COMPILER_BARRIER);
     return true;
   }
 
   if (deref_offset_ && !deref_ident_) {
-    tfm::format(stream, "{ u64 __addr = (*(volatile u64 *)&ctx->%s) + (%d)",
+    tfm::format(stream, "{ u64 __addr = ctx->%s + %d",
                 *base_register_name_, *deref_offset_);
     if (index_register_name_) {
       int scale = scale_.value_or(1);
-      tfm::format(stream, " + ((*(volatile u64 *)&ctx->%s) * %d);", *index_register_name_, scale);
+      tfm::format(stream, " + (ctx->%s * %d);", *index_register_name_, scale);
     } else {
       tfm::format(stream, ";");
     }
+    // Theoretically, llvm SimplifyCFG SinkThenElseCodeToEnd may still
+    // sink the bpf_probe_read call, so put a barrier here to prevent
+    // sinking of ctx->#fields.
+    tfm::format(stream, " %s ", COMPILER_BARRIER);
     tfm::format(stream,
                 "%s __res = 0x0; "
                 "bpf_probe_read(&__res, sizeof(__res), (void *)__addr); "
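
For illustration, with this patch the generated code for Qin's test
case takes roughly the following shape (a sketch only, reusing the
example's dest pointer and probe addresses; the exact text is emitted
by Argument::assign_to_local above):

  switch(ctx->ip) {
      case 0x7fdf2ede9820ULL:
          /* plain field access: the verifier sees "ctx + field_offset" */
          *((int64_t *)dest) = ctx->r12;
          /* empty asm body: emits no instructions, but its "memory"
             clobber stops SinkThenElseCodeToEnd from merging the two
             field loads into one register-adjusted load */
          __asm__ __volatile__("": : :"memory");
          return 0;
      case 0x7fdf2edecd9cULL:
          *((int64_t *)dest) = ctx->bx;
          __asm__ __volatile__("": : :"memory");
          return 0;
  }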
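
For the dereference case (deref_offset_ set, deref_ident_ unset), the
barrier sits between the address computation and the bpf_probe_read
call. A rough sketch with made-up values (base register bx, offset 16,
local name __lo; the hunk above is truncated, so the closing assignment
shown here is an assumption about how the generated block ends):

  { u64 __addr = ctx->bx + 16;
    /* keep the ctx->bx read anchored before the helper call */
    __asm__ __volatile__("": : :"memory");
    u64 __res = 0x0;
    bpf_probe_read(&__res, sizeof(__res), (void *)__addr);
    __lo = __res; }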