Compiler Construction Experiment 1
Implementing a Scanner for TINY+
You are to write a lexical analyzer/scanner for the language TINY+.
Goals
1. The input of the scanner is a source code file, and its output is a stream of tokens.
2. Your scanner should use the longest possible match. For example, the string ‘:=’ is to be identified as a single assignment symbol rather than as ‘:’ followed by ‘=’.
3. A token is represented as (Kind, Value). We use the following symbols to denote the different kinds of tokens:
KEY denotes reserved words
SYM denotes special symbols
ID denotes identifiers
NUM denotes numeric constants
STR denotes string constants
4. Check for lexical errors: give meaningful error messages and report the lines where the errors occur. The kinds of lexical errors are:
Illegal character: the scanner encounters a character that is not in the alphabet of TINY+, such as $.
Missing closing quote of a string (STR), such as ' scanner
Missing right delimiter of a comment, such as:
{this is an example
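Goals 2 and 4 can be illustrated with a minimal C++ sketch. It is not a full scanner: it only shows longest-match symbol recognition and detection of the three lexical errors with their line numbers. The names longestSymbolAt, findLexErrors and LexError are hypothetical. The symbol set is taken only from the example programs in this handout, and the sketch assumes that a string must close on the line where it opens and that an unterminated comment is reported at the line where it starts.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Symbols seen in the example programs; the full TINY+ set may be larger (assumption).
static const std::vector<std::string> kSymbols = {
    ":=", "<=", ",", ";", "+", "-", "*", "/", "(", ")", "=", "<"};

// Return the longest symbol that starts at src[pos], or "" if none matches.
std::string longestSymbolAt(const std::string& src, std::size_t pos) {
    std::string best;
    for (const std::string& sym : kSymbols) {
        if (sym.size() > best.size() && src.compare(pos, sym.size(), sym) == 0) {
            best = sym;  // keep the longer candidate, e.g. "<=" over "<"
        }
    }
    return best;
}

struct LexError {
    int line;             // 1-based line where the error is reported
    std::string message;  // human-readable description
};

// Walk the source once and collect the three kinds of lexical errors.
std::vector<LexError> findLexErrors(const std::string& src) {
    std::vector<LexError> errors;
    int line = 1;
    std::size_t i = 0;
    while (i < src.size()) {
        const char c = src[i];
        const unsigned char uc = static_cast<unsigned char>(c);
        if (c == '\n') {
            ++line;
            ++i;
        } else if (std::isspace(uc) || std::isalnum(uc)) {
            ++i;  // whitespace, or part of an identifier, number or keyword
        } else if (c == '{') {
            const int startLine = line;  // report at the opening brace (assumption)
            std::size_t j = i + 1;
            while (j < src.size() && src[j] != '}') {
                if (src[j] == '\n') ++line;
                ++j;
            }
            if (j == src.size()) {
                errors.push_back({startLine, "comment is missing its closing '}'"});
                i = j;
            } else {
                i = j + 1;  // skip the closing brace
            }
        } else if (c == '\'') {
            std::size_t j = i + 1;
            // Assumption: a string may not span lines.
            while (j < src.size() && src[j] != '\'' && src[j] != '\n') ++j;
            if (j < src.size() && src[j] == '\'') {
                i = j + 1;  // skip the closing quote
            } else {
                errors.push_back({line, "string is missing its closing quote"});
                i = j;  // resume at the newline or at end of input
            }
        } else {
            const std::string sym = longestSymbolAt(src, i);
            if (sym.empty()) {
                errors.push_back({line, std::string("illegal character '") + c + "'"});
                ++i;
            } else {
                i += sym.size();
            }
        }
    }
    return errors;
}
```

Under these assumptions, calling findLexErrors on a program whose second line contains $ returns one error with line equal to 2, and longestSymbolAt("A<=B", 1) returns "<=" rather than "<".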
Requirements
1. Write your program in C or C++.
2. This experiment must be finished in 4 periods. You will submit a report and the source code.
Example output for some TINY+ programs
Test1
or and int bool char
while do if then else
end repeat until read write
, ; := + -
* / ( ) = a2c
123 'EFG'
The scanner should give the outputs:
(KEY, or) (KEY, and) (KEY, int) (KEY, bool)
(KEY, char) (KEY, while) (KEY, do) (KEY, if)
(KEY, then) (KEY, else) (KEY, end) (KEY, repeat)
(KEY, until) (KEY, read) (KEY, write) (SYM, ,)
(SYM, ;) (SYM, :=) (SYM, +) (SYM, -)
(SYM, *) (SYM, /) (SYM, () (SYM, ))
(SYM, ) (SYM, =) (ID, a2c) (NUM, 123) (STR, EFG)
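The (Kind, Value) pairs in the output above could be produced along the following lines. This is only a sketch: the names Kind, Token, classifyWord and toString are hypothetical, and the reserved-word table contains only the words that appear in Test1, so the full TINY+ list may be longer.

```cpp
#include <cctype>
#include <set>
#include <string>

enum class Kind { KEY, SYM, ID, NUM, STR };

struct Token {
    Kind kind;
    std::string value;
};

// Reserved words that appear in Test1 (assumption: the real list may contain more).
static const std::set<std::string> kKeywords = {
    "or", "and", "int", "bool", "char", "while", "do", "if",
    "then", "else", "end", "repeat", "until", "read", "write"};

// Classify a word made of letters and digits as KEY, NUM or ID.
Token classifyWord(const std::string& word) {
    if (kKeywords.count(word) > 0) {
        return {Kind::KEY, word};
    }
    bool allDigits = !word.empty();
    for (unsigned char c : word) {
        if (!std::isdigit(c)) allDigits = false;
    }
    return {allDigits ? Kind::NUM : Kind::ID, word};
}

// Format a token in the "(Kind, Value)" style used by the expected output.
std::string toString(const Token& t) {
    static const char* const names[] = {"KEY", "SYM", "ID", "NUM", "STR"};
    return std::string("(") + names[static_cast<int>(t.kind)] + ", " + t.value + ")";
}
```

For instance, toString(classifyWord("while")) yields "(KEY, while)" and toString(classifyWord("a2c")) yields "(ID, a2c)". Checking the reserved-word table before falling back to ID is one common way to keep keywords out of the identifier rule.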
Test2
{this is an example}
int A,B;
bool C1, C2, C3;
char D;
D:= 'scanner';
while A<=B do
A:=A*2
end
The scanner should give the outputs:
(KEY, int) (ID, A) (SYM, ,) (ID, B)
(SYM, ;) (KEY, bool) (ID, C1) (SYM, ,)
(ID, C2) (SYM, ,) (ID, C3