IR (Intermediate Representation)

AST vs. IR (Why IR?)

AST:

high-level and close to the grammar structure

usually language dependent

suitable for fast type checking

lacks control flow information

IR:

low-level and close to machine code

usually language independent

compact and uniform

contains control flow information

usually considered the basis for static analysis

Jimple Basic Concepts:

The four main method invocation instructions of the JVM:

invokespecial: calls constructors, superclass methods, and private methods

invokevirtual: calls instance methods (virtual dispatch)

invokeinterface: calls interface methods; it cannot be optimized the way invokevirtual can, and it has to check that the receiver implements the interface

invokestatic: call static methods
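
The selection rules in the list above can be sketched as a small decision function. This is only an illustration of those simplified rules: the Call record and its field names are hypothetical (not a JVM or Soot type), and real compilers may choose differently in some cases (for example, newer javac versions may not use invokespecial for private methods).

```python
from dataclasses import dataclass


@dataclass
class Call:
    """Hypothetical description of a call site, for illustration only."""
    is_static: bool = False
    is_constructor: bool = False
    is_private: bool = False
    is_super_call: bool = False
    on_interface: bool = False


def invoke_kind(call):
    """Pick an invoke instruction using the simplified rules from the notes."""
    if call.is_static:
        return "invokestatic"
    # Constructors, super calls and private methods need no virtual dispatch.
    if call.is_constructor or call.is_super_call or call.is_private:
        return "invokespecial"
    if call.on_interface:
        return "invokeinterface"
    # Everything else is an ordinary instance call with virtual dispatch.
    return "invokevirtual"
```

For example, invoke_kind(Call(is_constructor=True)) gives "invokespecial", and invoke_kind(Call()) gives "invokevirtual".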

method signature: class name: return type method_name(parameter 1 type, parameter 2 type, ...)
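
In Jimple text, a signature of this shape is usually written inside angle brackets, e.g. <java.lang.StringBuilder: java.lang.StringBuilder append(java.lang.String)>. The sketch below splits such a string into its parts. The function name parse_signature and the returned dictionary keys are made up for this example, and the regular expression only covers the simple shape described above.

```python
import re

# <ClassName: ReturnType methodName(ParamType1,ParamType2)>
SIGNATURE = re.compile(
    r"^<(?P<cls>[^:]+):\s+(?P<ret>\S+)\s+(?P<name>[^\s(]+)\((?P<params>[^)]*)\)>$"
)


def parse_signature(sig):
    """Split a Jimple-style method signature into its components."""
    match = SIGNATURE.match(sig.strip())
    if match is None:
        raise ValueError(f"not a method signature: {sig!r}")
    # An empty parameter list yields an empty Python list.
    params = [p.strip() for p in match.group("params").split(",") if p.strip()]
    return {
        "class": match.group("cls"),
        "return": match.group("ret"),
        "name": match.group("name"),
        "params": params,
    }
```

A constructor such as <java.lang.Object: void <init>()> parses the same way, with the name <init> and an empty parameter list.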

3AC (3-Address Code) and SSA (Static Single Assignment)

3AC:

There is at most one operator on the right side of an instruction

In 3AC, dedicated control statements (such as goto and if statements) describe the program’s control flow
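
The "at most one operator on the right side" rule can be seen by flattening a nested expression into temporaries. The sketch below is an illustration only: it uses Python's own ast module as the front end, supports just +, -, *, / on names and constants, and the temporary names t1, t2, ... are an arbitrary choice.

```python
import ast

OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def to_3ac(source):
    """Flatten one assignment `target = expression` into 3AC lines."""
    assign = ast.parse(source).body[0]
    if not isinstance(assign, ast.Assign) or not isinstance(assign.targets[0], ast.Name):
        raise ValueError("expected a simple assignment")
    target = assign.targets[0].id
    code = []
    counter = 0

    def operand(node):
        """Return a name or constant, emitting temporaries for sub-expressions."""
        nonlocal counter
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            left = operand(node.left)
            right = operand(node.right)
            counter += 1
            temp = f"t{counter}"
            code.append(f"{temp} = {left} {OPS[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    value = assign.value
    if isinstance(value, ast.BinOp) and type(value.op) in OPS:
        # The outermost operator writes directly into the target.
        left = operand(value.left)
        right = operand(value.right)
        code.append(f"{target} = {left} {OPS[type(value.op)]} {right}")
    else:
        code.append(f"{target} = {operand(value)}")
    return code
```

For instance, to_3ac("a = b + c * d") yields the two instructions t1 = c * d and a = b + t1, each with a single operator on the right side.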

SSA:

Every variable has exactly one definition

In SSA, different definitions of a variable that reach a control flow merge point are combined with a phi function. The phi function is placed at the entry of the basic block where the paths merge, and it selects the value according to the path along which control arrived
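
The "exactly one definition" property is easiest to see on straight-line code, where SSA conversion reduces to renaming. The sketch below handles only that case: it assumes instructions given as (target, left, operator, right) tuples, uses a made-up name_version naming scheme, and does not place phi functions, which would be needed as soon as branches merge.

```python
def to_ssa(instructions):
    """Rename straight-line 3AC so that every variable is defined once."""
    version = {}  # variable name -> number of definitions seen so far
    result = []

    def use(name):
        # Constants and inputs that were never assigned keep their name.
        if name in version:
            return f"{name}_{version[name]}"
        return name

    for target, left, op, right in instructions:
        # Rename the uses first: they refer to the previous definitions.
        new_left, new_right = use(left), use(right)
        version[target] = version.get(target, 0) + 1
        result.append((f"{target}_{version[target]}", new_left, op, new_right))
    return result
```

Given x = a + b, x = x + 1, y = x * x, the result is x_1 = a + b, x_2 = x_1 + 1, y_1 = x_2 * x_2.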

Basic Blocks (BB)

A basic block is a maximal sequence of consecutive three-address instructions with two properties: it can be entered only at the beginning (its first instruction), and it can be exited only at the end (its last instruction)
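
One common way to compute basic blocks from this definition is the textbook leader method: the first instruction, every jump target, and every instruction right after a jump start a new block. The sketch below assumes a hypothetical mini 3AC syntax in which jumps are written "goto N" or "if ... goto N", with N a valid 0-based instruction index.

```python
import re

JUMP = re.compile(r"goto (\d+)$")


def basic_blocks(program):
    """Partition a list of 3AC strings into blocks of instruction indices."""
    leaders = {0} if program else set()
    for i, instr in enumerate(program):
        match = JUMP.search(instr)
        if match:
            leaders.add(int(match.group(1)))  # the jump target starts a block
            if i + 1 < len(program):
                leaders.add(i + 1)            # so does the instruction after a jump
    starts = sorted(leaders)
    ends = starts[1:] + [len(program)]
    # Each block runs from its leader up to (not including) the next leader.
    return [list(range(s, e)) for s, e in zip(starts, ends)]
```

For the six-instruction loop i = 0; if i >= n goto 5; s = s + i; i = i + 1; goto 1; return s, the leaders are 0, 1, 2 and 5, giving the blocks [0], [1], [2, 3, 4] and [5].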

Control Flow Graph (CFG)

The nodes of a CFG are basic blocks; the edges represent possible transfers of control between them

Build CFG:
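
A usual construction, sketched here under assumptions, adds an edge from block A to block B when the last instruction of A jumps to the first instruction of B, or when B immediately follows A and A does not end in an unconditional jump. The mini syntax ("goto N", "if ... goto N", "return ...") and the block format (lists of instruction indices) are hypothetical choices for this example; treating return as having no successor is also an assumption of the sketch.

```python
import re

JUMP = re.compile(r"goto (\d+)$")


def build_cfg(program, blocks):
    """Return sorted CFG edges (from_block, to_block) between basic blocks."""
    # Map the index of each block's first instruction to the block's number.
    block_of = {block[0]: k for k, block in enumerate(blocks)}
    edges = set()
    for k, block in enumerate(blocks):
        last = program[block[-1]]
        match = JUMP.search(last)
        if match:
            # Jump edge: from this block to the block that starts at the target.
            edges.add((k, block_of[int(match.group(1))]))
        no_fall_through = last.startswith("goto") or last.startswith("return")
        if not no_fall_through and k + 1 < len(blocks):
            # Fall-through edge to the block that follows in program order.
            edges.add((k, k + 1))
    return sorted(edges)
```

For the loop i = 0; if i >= n goto 5; s = s + i; i = i + 1; goto 1; return s with blocks [0], [1], [2, 3, 4], [5], this produces the edges (0, 1), (1, 2), (1, 3) and (2, 1).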