IR (Intermediate Representation)
AST vs. IR (Why IR?)
AST:
high-level and close to the grammar structure
usually language dependent
suitable for fast type checking
lacks control flow information
IR:
low-level and close to machine code
usually language independent
compact and uniform
contains control flow information
usually considered as the basis for static analysis
Jimple Basic Concepts:
The four main method invocation instructions of the JVM:
invokespecial: calls constructors, superclass methods, and private methods
invokevirtual: calls instance methods (virtual dispatch)
invokeinterface: calls interface methods; cannot be optimized the way invokevirtual can, because the interface implementation must be checked
invokestatic: calls static methods
method signature: class name: return type method_name(parameter 1 type, parameter 2 type, ...)
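To make the invocation kinds listed above concrete, here is a small Java sketch in which each call site is annotated with the instruction a typical javac would emit for it. The class and method names (InvokeKinds, Shape, Base, Square, twice) are invented for illustration, and the annotations are an assumption about common compiler output rather than a guarantee. Private method calls are left out because the instruction chosen for them can differ between compiler versions. Running `javap -c` on the compiled class is one way to check the actual bytecode.

```java
public class InvokeKinds {
    interface Shape { double area(); }

    static class Base {
        String name() { return "base"; }
    }

    static class Square extends Base implements Shape {
        private final double side;

        Square(double side) {
            super();                          // invokespecial: superclass constructor
            this.side = side;
        }

        @Override
        String name() {
            return "square/" + super.name();  // invokespecial: superclass method
        }

        @Override
        public double area() { return side * side; }
    }

    static double twice(double x) { return 2 * x; }

    static String demo() {
        Square sq = new Square(3.0);  // invokespecial: constructor call
        Base asBase = sq;
        Shape asShape = sq;
        String n = asBase.name();     // invokevirtual: dispatched to Square.name()
        double a = asShape.area();    // invokeinterface: call through an interface type
        double t = twice(a);          // invokestatic: static method call
        return n + " " + a + " " + t;
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints "square/base 9.0 18.0"
    }
}
```

Following the signature pattern above, the static method here would be written roughly as `InvokeKinds: double twice(double)`. Jimple output from Soot usually wraps such a signature in angle brackets, but the exact rendering depends on the tool.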
3AC (3-Address Code) and SSA (Static Single Assignment)
3AC:
There is at most one operator on the right side of an instruction
In 3AC, special control statements (such as GOTO and IF statements) are used to describe the program’s control flow
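The two properties of 3AC above can be illustrated by lowering a compound expression by hand. This is only a sketch: Java has no goto, so the 3AC control statements appear in comments, and the temporaries t1, t2, t3 and the label L1 are names chosen for this example.

```java
public class ThreeAddressDemo {
    // Source-level form: several operators in a single expression.
    static int original(int a, int b, int c) {
        if (a > b) {
            return a + b * c - 4;
        }
        return 0;
    }

    // Hand-lowered form: at most one operator on each right-hand side.
    static int lowered(int a, int b, int c) {
        if (a > b) {           // 3AC: if a <= b goto L1
            int t1 = b * c;    // 3AC: t1 = b * c
            int t2 = a + t1;   // 3AC: t2 = a + t1
            int t3 = t2 - 4;   // 3AC: t3 = t2 - 4
            return t3;         // 3AC: return t3
        }
        return 0;              // 3AC: L1: return 0
    }

    public static void main(String[] args) {
        System.out.println(original(5, 2, 3) + " " + lowered(5, 2, 3)); // prints "7 7"
    }
}
```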
SSA:
Every variable has exactly one definition
In SSA, the program’s control flow is represented using basic blocks and phi functions. A phi function is placed at the entry of a basic block where control flow merges, and it selects among the different definitions of a variable that reach that block, producing a single new definition
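A rough way to see the SSA idea in ordinary Java is to rename every definition and mark the names `final`, so the compiler itself rejects a second assignment. Java has no phi function, so this sketch emulates one by assigning the merged name x3 once on each incoming path. The version numbers x0 to x3 and y0 are chosen for illustration only.

```java
public class SsaDemo {
    // Non-SSA form: x is defined three times.
    static int original(int p) {
        int x = p + 1;
        if (x > 10) {
            x = x - 10;
        } else {
            x = x * 2;
        }
        return x + 1;
    }

    // SSA-style form: every name is defined exactly once.
    static int ssaStyle(int p) {
        final int x0 = p + 1;
        final int x3;                 // stands in for x3 = phi(x1, x2)
        if (x0 > 10) {
            final int x1 = x0 - 10;
            x3 = x1;                  // value arriving from the "then" path
        } else {
            final int x2 = x0 * 2;
            x3 = x2;                  // value arriving from the "else" path
        }
        final int y0 = x3 + 1;
        return y0;
    }

    public static void main(String[] args) {
        System.out.println(original(20) + " " + ssaStyle(20)); // prints "12 12"
    }
}
```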
Basic Blocks (BB)
A basic block is a maximal sequence of consecutive three-address instructions that can be entered only at the beginning (its first instruction) and exited only at the end (its last instruction)
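One common way to obtain blocks that satisfy this definition is the textbook leader-based partition: the first instruction, every jump target, and every instruction right after a jump start a new block. The sketch below assumes a hypothetical Instr type that stores a jump target as an instruction index (-1 for non-jumps). It treats only jumps as block terminators, which is a simplification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class BasicBlockDemo {
    // Hypothetical 3AC instruction: display text plus an optional jump target index.
    static final class Instr {
        final String text;
        final int target;   // index of the jump target, or -1 for a non-jump
        Instr(String text, int target) { this.text = text; this.target = target; }
        Instr(String text) { this(text, -1); }
    }

    // Returns the basic blocks as lists of instruction indices.
    static List<List<Integer>> partition(List<Instr> code) {
        TreeSet<Integer> leaders = new TreeSet<>();
        if (!code.isEmpty()) leaders.add(0);                   // first instruction
        for (int i = 0; i < code.size(); i++) {
            int t = code.get(i).target;
            if (t >= 0) {
                leaders.add(t);                                // target of a jump
                if (i + 1 < code.size()) leaders.add(i + 1);   // instruction after a jump
            }
        }
        List<List<Integer>> blocks = new ArrayList<>();
        List<Integer> current = null;
        for (int i = 0; i < code.size(); i++) {
            if (leaders.contains(i)) {                         // a leader opens a new block
                current = new ArrayList<>();
                blocks.add(current);
            }
            current.add(i);
        }
        return blocks;
    }

    static List<Instr> sample() {
        return List.of(
            new Instr("x = input"),            // 0
            new Instr("y = x - 1"),            // 1
            new Instr("if y > 0 goto 5", 5),   // 2
            new Instr("z = y * 2"),            // 3
            new Instr("goto 6", 6),            // 4
            new Instr("z = 0"),                // 5
            new Instr("return z"));            // 6
    }

    public static void main(String[] args) {
        System.out.println(partition(sample())); // prints [[0, 1, 2], [3, 4], [5], [6]]
    }
}
```

In the sample, instructions 0, 3, 5, and 6 are leaders, so each block can only be entered at its first instruction and left at its last.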
Control Flow Graph (CFG)
The nodes of a CFG are basic blocks
Build CFG: