Front-end Analysis/Compilation

Stages


Front-end analysis has four stages: lexical analysis, syntax analysis, semantic analysis and intermediate code generation.

Frequent access to a symbol table is required during front-end analysis. A symbol table is a data structure that stores the name and attributes, such as the data type and the scope, of every identifier.
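A symbol table can be sketched as a dictionary keyed by scope and name. The version below is only an illustration: the class names Symbol and SymbolTable, the "global" default scope and the fall-back lookup rule are assumptions, not part of any particular compiler.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    data_type: str
    scope: str


class SymbolTable:
    """Maps (scope, name) to the attributes recorded for an identifier."""

    def __init__(self):
        self._entries = {}

    def insert(self, name, data_type, scope="global"):
        key = (scope, name)
        if key in self._entries:
            raise KeyError(f"{name!r} is already declared in scope {scope!r}")
        self._entries[key] = Symbol(name, data_type, scope)

    def lookup(self, name, scope="global"):
        # Try the given scope first, then fall back to the global scope.
        for candidate in (scope, "global"):
            symbol = self._entries.get((candidate, name))
            if symbol is not None:
                return symbol
        return None


table = SymbolTable()
table.insert("Num", "integer")
print(table.lookup("Num"))
```

Each later stage can call lookup to retrieve the data type and scope of an identifier instead of rescanning the source code.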


Lexical Analysis


The first stage of front-end analysis is lexical analysis. Lexical analysis separates each line of source code into tokens. The sequence of characters that makes up a token is called a lexeme. Each token consists of a token name (its category) and a token value (the lexeme itself). All identifiers must be entered into the symbol table.

For example, the following declaration statement:
Var Num : integer;
will be separated into five tokens:
Var, Num, :, integer, ;
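The separation above can be sketched with regular expressions. This is a minimal illustration rather than a full lexer: the token patterns, the KEYWORDS set (which treats integer as a keyword for simplicity) and the function name tokenize are all assumptions.

```python
import re

# Assumed keyword set, compared case-insensitively.
KEYWORDS = {"var", "integer"}

# Each pair is (token name, pattern); earlier patterns are tried first.
TOKEN_SPEC = [
    ("Literal", r"\d+"),
    ("Word", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("Operator", r":=|[+\-*/=<>]"),
    ("Separator", r"[:;(),]"),
    ("Skip", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token name, token value) pairs."""
    tokens = []
    position = 0
    while position < len(source):
        match = TOKEN_RE.match(source, position)
        if match is None:
            raise SyntaxError(f"unexpected character {source[position]!r}")
        position = match.end()
        kind, value = match.lastgroup, match.group()
        if kind == "Skip":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "Word":
            kind = "Keyword" if value.lower() in KEYWORDS else "Identifier"
        tokens.append((kind, value))
    return tokens


tokens = tokenize("Var Num : integer;")
for name, value in tokens:
    print(name, value)
```

A real lexer would also enter each Identifier token, such as Num, into the symbol table at this point.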

Examples of token names and token values:

Token Name    Token Value
Identifier    count, x, colour
Keyword       while, if, return
Separator     (, {, ;
Operator      +, =, <
Literal       123, "blue", true
Comment       // Here, /* Here */

Syntax Analysis


Syntax analysis, also known as parsing, is the process of analysing a sequence of tokens according to the formal grammar of the programming language. Its result is a syntax tree, also called a parse tree.

For example, the following assignment statement:
y := 2 * x + 4
will be parsed into a tree with the assignment operator at the root:

      :=
     /  \
    y    +
        / \
       *   4
      / \
     2   x
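A recursive-descent parser is one way to build such a tree. The sketch below assumes a small grammar (assignment, expression, term, factor) chosen only for this example, and represents each tree node as a tuple (operator, left, right). The names tokenize and Parser are hypothetical.

```python
import re


def tokenize(source):
    # Numbers, identifiers, ":=" and single-character operators.
    # Whitespace (and, in this sketch, any unknown character) is skipped.
    return re.findall(r"\d+|[A-Za-z_]\w*|:=|[+\-*/()]", source)


class Parser:
    """Assumed grammar:
    assignment -> identifier ':=' expression
    expression -> term (('+' | '-') term)*
    term       -> factor (('*' | '/') factor)*
    factor     -> number | identifier | '(' expression ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.position = 0

    def peek(self):
        return self.tokens[self.position] if self.position < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.position += 1
        return token

    def expect(self, expected):
        token = self.advance()
        if token != expected:
            raise SyntaxError(f"expected {expected!r} but found {token!r}")

    def assignment(self):
        target = self.advance()
        if not (target[0].isalpha() or target[0] == "_"):
            raise SyntaxError(f"cannot assign to {target!r}")
        self.expect(":=")
        tree = (":=", target, self.expression())
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expression(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            operator = self.advance()
            node = (operator, node, self.term())
        return node

    def term(self):
        # Parsing '*' and '/' one level below '+' and '-' gives them higher precedence.
        node = self.factor()
        while self.peek() in ("*", "/"):
            operator = self.advance()
            node = (operator, node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token == "(":
            node = self.expression()
            self.expect(")")
            return node
        if token[0].isalnum() or token[0] == "_":
            return token
        raise SyntaxError(f"unexpected token {token!r}")


tree = Parser(tokenize("y := 2 * x + 4")).assignment()
print(tree)
```

Because multiplication is handled in term, 2 * x is grouped into a subtree before the addition is applied, which matches the usual precedence rules.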


Semantic Analysis


Semantic analysis gathers the semantic information that the compiler needs from the source code. It performs tasks such as type checking and object binding (associating each use of an identifier with its declaration). An annotated abstract syntax tree is constructed to record this information.
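Type checking can be illustrated by walking a syntax tree and attaching a data type to every node. The sketch below makes several assumptions: the tree is a nested tuple (operator, left, right) with strings as leaves, the symbol table is a plain dictionary from identifier to type, and the only rule is that both operands of an operator must have the same type. The function name annotate is hypothetical.

```python
def annotate(node, symbols):
    """Return a copy of the tree in which every node carries its data type."""
    if isinstance(node, str):
        if node.isdigit():
            return {"value": node, "type": "integer"}
        # Object binding: every identifier must resolve to a declaration.
        if node not in symbols:
            raise NameError(f"undeclared identifier {node!r}")
        return {"value": node, "type": symbols[node]}

    operator, left, right = node
    left, right = annotate(left, symbols), annotate(right, symbols)
    # Assumed rule: both sides of any operator, including ':=', must agree.
    if left["type"] != right["type"]:
        raise TypeError(f"{operator!r} applied to {left['type']} and {right['type']}")
    return {"value": operator, "type": left["type"], "children": [left, right]}


symbols = {"x": "integer", "y": "integer"}
tree = (":=", "y", ("+", ("*", "2", "x"), "4"))
annotated = annotate(tree, symbols)
print(annotated["type"])
```

If x were declared as real instead, the same call would raise a TypeError at the multiplication node, which is how a type mismatch would be reported in this sketch.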


Intermediate Code Generation


Intermediate code is generated from the annotated abstract syntax tree. It is an accurate representation of the source code, designed to support further processing such as optimisation and translation.
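One common form of intermediate code is three-address code, in which each instruction has at most one operator. The sketch below uses that form as an assumption (the text above does not name a specific one) and, to stay short, works on a plain tuple tree (operator, left, right) rather than an annotated one. The function name generate and the temporary names t1, t2 are hypothetical.

```python
import itertools


def generate(tree):
    """Translate an assignment tree into a list of three-address instructions."""
    code = []
    temporaries = itertools.count(1)

    def emit(node):
        # Leaves (identifiers and literals) are used directly as operands.
        if isinstance(node, str):
            return node
        operator, left, right = node
        left_operand = emit(left)
        right_operand = emit(right)
        # Each inner node stores its result in a fresh temporary.
        result = f"t{next(temporaries)}"
        code.append(f"{result} := {left_operand} {operator} {right_operand}")
        return result

    _, target, expression = tree
    value = emit(expression)
    code.append(f"{target} := {value}")
    return code


instructions = generate((":=", "y", ("+", ("*", "2", "x"), "4")))
for line in instructions:
    print(line)
```

For y := 2 * x + 4 this produces three instructions: t1 := 2 * x, then t2 := t1 + 4, then y := t2. A later optimisation pass could work on this flat list without needing the tree.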
