LLVM IR Instruction Tricks
LLVM IR instructions are the basic building blocks we work with when writing LLVM passes. Here are some things I have learned along the way.
A basic classification of the commonly used LLVM IR instructions
Here is a brief classification of the instructions by operand layout. It is useful when you want to analyze the operands and the result of an instruction.
Where the same type appears in several positions of an instruction, those positions usually have to agree, so be especially careful when you change the type of an instruction.
Alloca Instruction
    %i = alloca type
The type of this instruction is type* (with opaque pointers, the default since LLVM 15, it prints as ptr).
Call Instruction
Two kinds of call instruction are worth distinguishing in LLVM: an ordinary function call,
    %i = call retType @func_name (type %p1, ...)
and a call to one of the debug intrinsics (llvm.dbg.declare or llvm.dbg.value),
    call void @llvm.dbg.declare/value (metadata type %p, ...)
The second kind is quite interesting and is explained in the section on getting debug information below.
Load Instruction
    %i = load type, type* %op
The type of this instruction is type, not type*.
Store Instruction
    store type %op1, type* %op2
GetElementPtr Instruction
    %i = getelementptr type, type1* %op1, type2 %op2, (type3 %op3)
Binary Instruction
    %i = binaryInst type %op1, %op2
Here binaryInst is a placeholder for an opcode such as Add, FAdd, etc.
Unary Instruction
    %i = unaryInst type %op
As with binaryInst in the Binary Instruction section, unaryInst is a placeholder for an opcode such as FNeg.
Cast Instruction
    %i = castInst type1 %op1 to type2
As with binaryInst in the Binary Instruction section, castInst is a placeholder. It can be FPToUI, FPToSI, SIToFP, UIToFP, ZExt, SExt, FPExt, Trunc, FPTrunc, or BitCast.
PHI Instruction
    %.i = phi type [%op1, %bb1], [%op2, %bb2], ...
Get debug information from Instruction::Call
For calls to the LLVM-defined debug intrinsics, such as llvm.dbg.value and llvm.dbg.declare, we can easily get almost all of the debug information from the LLVM metadata, as long as the code was compiled with a debug configuration (-O0 -g).
Here is a piece of code showing how to get the debug information you want from the IR.
    case Instruction::Call:
Instructions that imply sign meanings
LLVM's integer types do not carry signedness. Instead, the choice of instruction hints at whether the bits are interpreted as signed or unsigned.
Signedness implied by the instruction name
LLVM uses UDiv, URem, and LShr when the operands are to be treated as unsigned, so the result is never negative. Correspondingly, SDiv, SRem, and AShr perform the signed versions of these operations.
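To make the difference concrete, here is a minimal sketch in plain C++ (no LLVM headers). The helpers udiv, sdiv, urem, srem, lshr, and ashr are hypothetical names chosen to mirror the opcodes, not LLVM APIs. They take the same raw 32-bit pattern and apply the unsigned or signed interpretation, assuming two's complement (which C++20 guarantees and mainstream compilers have long provided).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not LLVM APIs): each takes raw 32-bit patterns, as an
// i32 would hold them, and interprets them the way its namesake opcode does.
uint32_t udiv(uint32_t a, uint32_t b) { return a / b; }
uint32_t urem(uint32_t a, uint32_t b) { return a % b; }
uint32_t lshr(uint32_t a, unsigned n) { return a >> n; }  // shifts in zeros

uint32_t sdiv(uint32_t a, uint32_t b) {
    return static_cast<uint32_t>(static_cast<int32_t>(a) / static_cast<int32_t>(b));
}
uint32_t srem(uint32_t a, uint32_t b) {
    return static_cast<uint32_t>(static_cast<int32_t>(a) % static_cast<int32_t>(b));
}
uint32_t ashr(uint32_t a, unsigned n) {
    // Arithmetic shift: the sign bit is copied into the vacated positions.
    return static_cast<uint32_t>(static_cast<int32_t>(a) >> n);
}

void demo_signed_vs_unsigned() {
    const uint32_t bits = 0xFFFFFFF8u;     // -8 if signed, 4294967288 if unsigned
    assert(udiv(bits, 2) == 0x7FFFFFFCu);  // large positive value
    assert(sdiv(bits, 2) == 0xFFFFFFFCu);  // bit pattern of -4
    assert(lshr(bits, 1) == 0x7FFFFFFCu);
    assert(ashr(bits, 1) == 0xFFFFFFFCu);  // bit pattern of -4
}
```

The same input bits give different results depending only on which opcode is used, which is why a pass can treat the opcode as a signedness hint for its operands.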
Some cast instructions carry the same hint. FPToUI, UIToFP, and ZExt treat the integer value involved (the result of FPToUI, the operand of UIToFP and ZExt) as unsigned, while FPToSI, SIToFP, and SExt treat it as signed.
Sign hint in the ICmp instruction
The ICmp predicates sgt, sge, slt, and sle compare the operands as signed values, while ugt, uge, ult, and ule compare them as unsigned values.
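A small C++ sketch of the idea, using hypothetical helpers icmp_ult and icmp_slt that model the two predicates on raw 32-bit patterns (two's complement assumed):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical models of `icmp ult` and `icmp slt` on i32 bit patterns.
bool icmp_ult(uint32_t a, uint32_t b) { return a < b; }
bool icmp_slt(uint32_t a, uint32_t b) {
    return static_cast<int32_t>(a) < static_cast<int32_t>(b);
}

void demo_icmp() {
    const uint32_t a = 0xFFFFFFFFu;  // -1 if signed, 4294967295 if unsigned
    const uint32_t b = 1u;
    assert(!icmp_ult(a, b));  // unsigned: 4294967295 < 1 is false
    assert(icmp_slt(a, b));   // signed:   -1 < 1 is true
}
```

The two predicates disagree exactly when the operands differ in their top bit, so the predicate chosen by the front end tells you how the source program viewed the values.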
Wrap flags on the instruction
nsw (No Signed Wrap) and nuw (No Unsigned Wrap) are flags that make the result a poison value if signed overflow or unsigned overflow occurs, respectively.
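As a rough model of what the flags mean for an i32 add, the sketch below reports when each flag's condition would be violated. MaybePoison, add_nuw, and add_nsw are hypothetical names introduced for illustration. Real poison is a property tracked by the optimizer, not a runtime value, so treat the boolean only as "this result would be poison".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: `poison` is true when the flagged add would yield poison.
// `value` holds the wrapped bits and is meaningless when `poison` is true.
struct MaybePoison {
    bool poison;
    uint32_t value;
};

MaybePoison add_nuw(uint32_t a, uint32_t b) {
    const uint32_t sum = a + b;  // wraps modulo 2^32
    return {sum < a, sum};       // an unsigned wrap happened iff the sum got smaller
}

MaybePoison add_nsw(uint32_t a, uint32_t b) {
    // Do the signed addition in 64 bits so the true result is representable.
    const int64_t wide = static_cast<int64_t>(static_cast<int32_t>(a)) +
                         static_cast<int64_t>(static_cast<int32_t>(b));
    const bool overflow = wide < INT32_MIN || wide > INT32_MAX;
    return {overflow, a + b};
}

void demo_wrap_flags() {
    // 0x7FFFFFFF + 1: fine as unsigned, overflows as signed.
    assert(!add_nuw(0x7FFFFFFFu, 1u).poison);
    assert(add_nsw(0x7FFFFFFFu, 1u).poison);
    // 0xFFFFFFFF + 1: wraps as unsigned, but is just -1 + 1 == 0 as signed.
    assert(add_nuw(0xFFFFFFFFu, 1u).poison);
    assert(!add_nsw(0xFFFFFFFFu, 1u).poison);
}
```

The two flags are independent: the same pair of operands can violate one and not the other, which is why an instruction may carry nsw, nuw, both, or neither.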
Copyright Notice
Anyone is free to use it, and please indicate the reference when using or publishing. Thank you!