I am trying to locate instructions in an LLVM pass by line and column number (as reported by a third-party tool) so that I can measure them. To do this, I compile the source files with clang -g -O0 -emit-llvm and read the location from the debug metadata using this code:
    const DebugLoc &location = instruction->getDebugLoc();
Unfortunately, this information is quite imprecise. Consider the following implementation of the Fibonacci function:
    unsigned fib(unsigned n) {
    	if (n < 2)
    		return n;

    	unsigned f = fib(n - 1) + fib(n - 2);
    	return f;
    }
I would like to find the single LLVM instruction in the resulting LLVM IR that corresponds to the unsigned f = ... assignment. I'm not interested in the calculations on the right-hand side. The generated basic block, including the corresponding debug metadata, is:
    [...]
    if.end:                                           ; preds = %entry
      call void @llvm.dbg.declare(metadata !{i32* %f}, metadata !17), !dbg !18
      %2 = load i32* %n.addr, align 4, !dbg !19
      %sub = sub i32 %2, 1, !dbg !19
      %call = call i32 @fib(i32 %sub), !dbg !19
      %3 = load i32* %n.addr, align 4, !dbg !20
      %sub1 = sub i32 %3, 2, !dbg !20
      %call2 = call i32 @fib(i32 %sub1), !dbg !20
      %add = add i32 %call, %call2, !dbg !20
      store i32 %add, i32* %f, align 4, !dbg !20
      %4 = load i32* %f, align 4, !dbg !21
      store i32 %4, i32* %retval, !dbg !21
      br label %return, !dbg !21
    [...]
    !17 = metadata !{i32 786688, metadata !4, metadata !"f", metadata !5, i32 5, metadata !8, i32 0, i32 0} ; [ DW_TAG_auto_variable ] [f] [line 5]
    !18 = metadata !{i32 5, i32 11, metadata !4, null}
    !19 = metadata !{i32 5, i32 15, metadata !4, null}
    !20 = metadata !{i32 5, i32 28, metadata !4, null}
    !21 = metadata !{i32 6, i32 2, metadata !4, null}
    !22 = metadata !{i32 7, i32 1, metadata !4, null}
As you can see, the !dbg !20 metadata attached to the store instruction points to line 5, column 28, which is the call to fib(n - 2). Worse, the addition and the subtraction n - 2 are also attributed to that call's position via !dbg !20.
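To make the mismatch concrete, here is a minimal standalone sketch. It is plain C++ with no LLVM dependency, not a pass: the instruction texts (without their !dbg suffix) and the line/column pairs are hard-coded from the dump above, and the names LocatedInst, ifEndBlock, findAt and checkMismatch are made up for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One instruction of the if.end block plus the line/column of its !dbg node.
struct LocatedInst {
    std::string text;
    unsigned line;
    unsigned col;
};

// Hard-coded copy of the dump (assumption: nothing is read from a real module).
inline std::vector<LocatedInst> ifEndBlock() {
    return {
        {"call void @llvm.dbg.declare(metadata !{i32* %f}, metadata !17)", 5, 11},
        {"%2 = load i32* %n.addr, align 4", 5, 15},
        {"%sub = sub i32 %2, 1", 5, 15},
        {"%call = call i32 @fib(i32 %sub)", 5, 15},
        {"%3 = load i32* %n.addr, align 4", 5, 28},
        {"%sub1 = sub i32 %3, 2", 5, 28},
        {"%call2 = call i32 @fib(i32 %sub1)", 5, 28},
        {"%add = add i32 %call, %call2", 5, 28},
        {"store i32 %add, i32* %f, align 4", 5, 28},
        {"%4 = load i32* %f, align 4", 6, 2},
        {"store i32 %4, i32* %retval", 6, 2},
        {"br label %return", 6, 2},
    };
}

// Return the text of every instruction located exactly at (line, col).
inline std::vector<std::string> findAt(const std::vector<LocatedInst>& insts,
                                       unsigned line, unsigned col) {
    std::vector<std::string> hits;
    for (const LocatedInst& inst : insts) {
        if (inst.line == line && inst.col == col) hits.push_back(inst.text);
    }
    return hits;
}

// Self-check of the observation in the text: looking up the position of `f`
// (5:11) yields only the dbg.declare call, while the store sits at 5:28
// together with the load, sub, call and add of the second fib call.
inline void checkMismatch() {
    const std::vector<LocatedInst> insts = ifEndBlock();
    assert(findAt(insts, 5, 11).size() == 1);
    assert(findAt(insts, 5, 28).size() == 5);
    assert(findAt(insts, 5, 28).back() == "store i32 %add, i32* %f, align 4");
}
```

An exact (line, column) lookup therefore cannot single out the store: the position the AST gives for f matches only the dbg.declare call, and the position the store actually carries is shared with four other instructions.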
Interestingly, the Clang AST emitted by clang -Xclang -ast-dump -fsyntax-only has all of this information, so I suspect that it is somehow lost during code generation. It seems that during code generation Clang reaches some internal sequence point and associates all following instructions with that position until the next sequence point (for example, a function call) occurs. For completeness, here is the declaration statement in the AST:
    |-DeclStmt 0x7ffec3869f48 <line:5:2, col:38>
    | `-VarDecl 0x7ffec382d680 <col:2, col:37> col:11 used f 'unsigned int' cinit
    |   `-BinaryOperator 0x7ffec3869f20 <col:15, col:37> 'unsigned int' '+'
    |     |-CallExpr 0x7ffec382d7e0 <col:15, col:24> 'unsigned int'
    |     | |-ImplicitCastExpr 0x7ffec382d7c8 <col:15> 'unsigned int (*)(unsigned int)' <FunctionToPointerDecay>
    |     | | `-DeclRefExpr 0x7ffec382d6d8 <col:15> 'unsigned int (unsigned int)' Function 0x7ffec382d490 'fib' 'unsigned int (unsigned int)'
    |     | `-BinaryOperator 0x7ffec382d778 <col:19, col:23> 'unsigned int' '-'
    |     |   |-ImplicitCastExpr 0x7ffec382d748 <col:19> 'unsigned int' <LValueToRValue>
    |     |   | `-DeclRefExpr 0x7ffec382d700 <col:19> 'unsigned int' lvalue ParmVar 0x7ffec382d3d0 'n' 'unsigned int'
    |     |   `-ImplicitCastExpr 0x7ffec382d760 <col:23> 'unsigned int' <IntegralCast>
    |     |     `-IntegerLiteral 0x7ffec382d728 <col:23> 'int' 1
    |     `-CallExpr 0x7ffec3869ef0 <col:28, col:37> 'unsigned int'
    |       |-ImplicitCastExpr 0x7ffec3869ed8 <col:28> 'unsigned int (*)(unsigned int)' <FunctionToPointerDecay>
    |       | `-DeclRefExpr 0x7ffec3869e10 <col:28> 'unsigned int (unsigned int)' Function 0x7ffec382d490 'fib' 'unsigned int (unsigned int)'
    |       `-BinaryOperator 0x7ffec3869eb0 <col:32, col:36> 'unsigned int' '-'
    |         |-ImplicitCastExpr 0x7ffec3869e80 <col:32> 'unsigned int' <LValueToRValue>
    |         | `-DeclRefExpr 0x7ffec3869e38 <col:32> 'unsigned int' lvalue ParmVar 0x7ffec382d3d0 'n' 'unsigned int'
    |         `-ImplicitCastExpr 0x7ffec3869e98 <col:36> 'unsigned int' <IntegralCast>
    |           `-IntegerLiteral 0x7ffec3869e60 <col:36> 'int' 2
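The suspicion about a "sticky" position can be written down as a toy model. This is only a sketch of the hypothesis, not Clang's actual code generation: the rule that entering a call expression is the only thing that moves the current location is an assumption chosen because it reproduces the columns in the IR dump, and Node, StickyEmitter, callFib, initializerOfF and modelLine5 are hypothetical names. The columns come from the AST dump of line 5.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy expression node; `col` is the start column taken from the AST dump.
struct Node {
    enum Kind { Load, Literal, Binary, Call } kind;
    std::string op;          // label of the instruction this node lowers to
    unsigned col;
    std::vector<Node> kids;  // operands, emitted before the node itself
};

struct Emitted {
    std::string op;
    unsigned col;  // location the model attaches, not the node's own column
};

// ASSUMPTION (hypothetical rule, not Clang's real logic): only entering a call
// expression updates the current location; everything else inherits it.
class StickyEmitter {
public:
    explicit StickyEmitter(unsigned startCol) : current_(startCol) {}

    void emit(const Node& n) {
        // Sanity check: leaves have no operands, operators and calls do.
        assert((n.kind == Node::Load || n.kind == Node::Literal) == n.kids.empty());
        if (n.kind == Node::Call) current_ = n.col;
        for (const Node& k : n.kids) emit(k);
        // Literals become immediate operands and emit no instruction.
        if (n.kind != Node::Literal) out_.push_back({n.op, current_});
    }

    // The store into `f` simply inherits whatever location is current.
    void emitStore(const std::string& op) { out_.push_back({op, current_}); }

    const std::vector<Emitted>& out() const { return out_; }

private:
    unsigned current_;
    std::vector<Emitted> out_;
};

// Build fib(n - <lit>) with the given start columns.
inline Node callFib(unsigned callCol, unsigned argCol, unsigned litCol,
                    const std::string& lit) {
    Node load{Node::Load, "load n", argCol, {}};
    Node literal{Node::Literal, lit, litCol, {}};
    Node sub{Node::Binary, "sub", argCol, {load, literal}};
    return Node{Node::Call, "call fib", callCol, {sub}};
}

// fib(n - 1) + fib(n - 2), the initializer of `f`.
inline Node initializerOfF() {
    return Node{Node::Binary, "add", 15,
                {callFib(15, 19, 23, "1"), callFib(28, 32, 36, "2")}};
}

// Emit the initializer followed by the store, starting at column 11 (the
// position of `f`, where the dbg.declare call is attached in the dump).
inline std::vector<Emitted> modelLine5() {
    StickyEmitter emitter(11);
    emitter.emit(initializerOfF());
    emitter.emitStore("store f");
    return emitter.out();
}
```

Under this assumed rule, modelLine5() attaches column 15 to the first load, sub and call, and column 28 to the second load, sub and call as well as to the add and the store, which is the same grouping as !dbg !19 and !dbg !20 in the IR. That the model fits one example does not show that Clang works this way.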
Is it possible to improve the accuracy of the debug metadata, or to identify the corresponding instruction in some other way? Ideally, I would like to leave Clang untouched, i.e. not modify or recompile it.