Find Loops in LLVM Bytecode

I want to find simple loops in LLVM bytecode and extract basic loop information.

For instance:

for (i=0; i<1000; i++) sum += i; 

I want to extract the bounds [0, 1000], the loop variable "i" and the loop body (sum += i).
What should I do?

I read the LLVM API documentation and found useful classes such as "Loop" and "LoopInfo",
but I do not know how to use them in detail.

Could you help me? A detailed usage example would be most useful.

+5
5 answers

If you do not want to use the pass manager, you need to call the Analyze method of the llvm::LoopInfoBase class on each function in the IR (assuming LLVM 3.4). However, Analyze takes the DominatorTree of each function as input, so you must generate that first. The following code is what I tested with LLVM 3.4 (assuming you have read the IR file and converted it into a Module*, named module):

    for (llvm::Module::iterator func = module->begin(), y = module->end(); func != y; func++) {
        // declarations have no body, so there is nothing to analyze
        if (func->isDeclaration())
            continue;

        // get the dominator tree of the current function
        llvm::DominatorTree* DT = new llvm::DominatorTree();
        DT->DT->recalculate(*func);

        // generate the LoopInfoBase for the current function
        llvm::LoopInfoBase<llvm::BasicBlock, llvm::Loop>* KLoop =
            new llvm::LoopInfoBase<llvm::BasicBlock, llvm::Loop>();
        KLoop->releaseMemory();
        KLoop->Analyze(DT->getBase());
    }

Basically, once KLoop has been populated, you can query all kinds of loop information at the IR level. See the API of the LoopInfoBase class for more details. By the way, you need to include the following headers: "llvm/Analysis/LoopInfo.h" and "llvm/Analysis/Dominators.h".
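To make it clearer why the loop analysis needs the dominator tree first, here is a minimal sketch of natural-loop detection on a toy control-flow graph. It does not use LLVM and is not LLVM's actual implementation; the names CFG, predecessors, computeDominators, NaturalLoop and findNaturalLoops are made up for illustration. It assumes block 0 is the entry and every block is reachable from it.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy CFG (hypothetical type): succ[b] lists the successors of block b.
using CFG = std::vector<std::vector<int>>;

// Invert the successor lists into predecessor lists.
std::vector<std::vector<int>> predecessors(const CFG &succ) {
  std::vector<std::vector<int>> pred(succ.size());
  for (int b = 0; b < static_cast<int>(succ.size()); ++b)
    for (int s : succ[b]) pred[s].push_back(b);
  return pred;
}

// dom[b] = set of blocks that dominate b, found by iterating to a fixed point.
std::vector<std::set<int>> computeDominators(const CFG &succ) {
  const int n = static_cast<int>(succ.size());
  assert(n > 0 && "the CFG needs an entry block");
  const auto pred = predecessors(succ);

  std::set<int> all;
  for (int b = 0; b < n; ++b) all.insert(b);
  std::vector<std::set<int>> dom(n, all);
  dom[0] = {0};  // the entry block is dominated only by itself

  bool changed = true;
  while (changed) {
    changed = false;
    for (int b = 1; b < n; ++b) {
      // Intersect the dominator sets of all predecessors, then add b itself.
      std::set<int> next = all;
      for (int p : pred[b]) {
        std::set<int> kept;
        for (int d : next)
          if (dom[p].count(d)) kept.insert(d);
        next = kept;
      }
      next.insert(b);
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}

// A natural loop: the header plus every block that can reach the latch
// without passing through the header.
struct NaturalLoop {
  int header;
  std::set<int> blocks;
};

std::vector<NaturalLoop> findNaturalLoops(const CFG &succ) {
  const int n = static_cast<int>(succ.size());
  const auto dom = computeDominators(succ);
  const auto pred = predecessors(succ);

  std::vector<NaturalLoop> loops;
  for (int latch = 0; latch < n; ++latch) {
    for (int header : succ[latch]) {
      // latch -> header is a back edge only if header dominates latch.
      if (!dom[latch].count(header)) continue;
      NaturalLoop loop{header, {header}};
      std::vector<int> work{latch};
      while (!work.empty()) {
        int b = work.back();
        work.pop_back();
        if (!loop.blocks.insert(b).second) continue;  // already visited
        for (int p : pred[b]) work.push_back(p);
      }
      loops.push_back(loop);
    }
  }
  return loops;
}
```

For the loop in the question, a plausible block layout is entry (0), header (1), body (2) and exit (3), i.e. the CFG {{1}, {2, 3}, {1}, {}}. The only back edge is body -> header, so the sketch reports one loop with header 1 and blocks {1, 2}.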

+5

Once you are at the LLVM IR level, the information you are asking for may no longer be exact. For example, clang may have transformed your code so that i goes from -1000 to 0 instead. Or it may have optimized "i" away entirely, so there is no explicit induction variable. If you really need to extract the information exactly as it is written in the C code, you need to look at clang, not LLVM IR. Otherwise, the best you can do is compute the loop trip count, in which case look at the ScalarEvolution pass.
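To illustrate why the trip count is still recoverable when the original bounds are not, here is a small sketch of the arithmetic for a loop of the form for (i = start; i < bound; i += step). The helper tripCount is hypothetical and is not an LLVM API; ScalarEvolution works symbolically on the IR, whereas this only handles known constants, assumes a positive step, and ignores overflow.

```cpp
#include <cassert>
#include <cstdint>

// Number of iterations of: for (i = start; i < bound; i += step).
// Hypothetical helper for illustration only; assumes step > 0 and that
// bound - start + step does not overflow.
std::int64_t tripCount(std::int64_t start, std::int64_t bound, std::int64_t step) {
  assert(step > 0 && "this sketch only handles positive strides");
  if (start >= bound) return 0;              // the body never runs
  return (bound - start + step - 1) / step;  // ceil((bound - start) / step)
}
```

Both tripCount(0, 1000, 1) and tripCount(-1000, 0, 1) give 1000, which is why a rewritten loop that counts from -1000 to 0 is indistinguishable from the original by trip count alone.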

Check out the PowerPC hardware loop transformation pass, which demonstrates the trip count computation fairly well: http://llvm.org/docs/doxygen/html/PPCCTRLoops_8cpp_source.html

The code is fairly heavy, but it should be possible to follow. The interesting function is PPCCTRLoops::convertToCTRLoop. If you have further questions, I can try to answer them.

+2

LLVM is just a library. You will not find AST nodes there.

I suggest taking a look at Clang, which is a compiler built on top of LLVM.

Perhaps this one is what you are looking for?

+1

As Matteo said, for LLVM to recognize the loop variable and condition, the file must be in LLVM IR. The question says you have it as LLVM bytecode, but since LLVM IR is written in SSA form, talking about "loop variables" is not really accurate. I am sure that if you describe what you are trying to do and what result you expect, we can help.

Code to get you started:

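    virtual void getAnalysisUsage(AnalysisUsage &AU) const {
        // LoopInfo must be available before this pass runs
        AU.addRequired<LoopInfo>();
    }

    bool runOnLoop(Loop *L, LPPassManager &) {
        BasicBlock *h = L->getHeader();
        if (BranchInst *bi = dyn_cast<BranchInst>(h->getTerminator())) {
            // getCondition() is only valid on a conditional branch
            if (bi->isConditional()) {
                Value *loopCond = bi->getCondition();
                // inspect loopCond here
            }
        }
        return false; // the IR was not modified
    }

This piece of code goes inside a regular LLVM pass (here a LoopPass, since it overrides runOnLoop).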

+1

Just updating Junxzm's answer: some references, pointers and methods have changed in LLVM 3.5.

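    for (llvm::Module::iterator f = m->begin(), fe = m->end(); f != fe; ++f) {
        // declarations have no body, so there is nothing to analyze
        if (f->isDeclaration())
            continue;

        // in 3.5 the dominator tree can be built directly, without the pass wrapper
        llvm::DominatorTree DT;
        DT.recalculate(*f);

        llvm::LoopInfoBase<llvm::BasicBlock, llvm::Loop>* LoopInfo =
            new llvm::LoopInfoBase<llvm::BasicBlock, llvm::Loop>();
        LoopInfo->releaseMemory();
        LoopInfo->Analyze(DT);
    }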
0

Source: https://habr.com/ru/post/1232062/

