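Building the vasm assembler and its instruction simulator vsim with lex and yacc
vasm and vsim are based on the instruction set of VeSPA, a small RISC CPU defined in the book Designing Digital Computer Systems with Verilog.
vasm translates VeSPA assembly programs into machine instructions in two passes.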
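The two-pass idea can be sketched as follows. This is a minimal illustration in Python, not vasm's actual lex/yacc implementation: the function name `two_pass` and the simplified input format (a line is either `label:` on its own or one 4-byte instruction) are assumptions made for the example.

```python
def two_pass(lines):
    """Resolve labels for a toy assembly listing (hypothetical format).

    Returns (symbols, resolved), where resolved is a list of
    (address, text) pairs with label operands replaced by hex addresses.
    """
    # Pass 1: walk the source, advance the location counter, record labels.
    symbols = {}
    address = 0
    instructions = []
    for raw in lines:
        text = raw.split(";")[0].strip()    # drop ';' comments
        if not text:
            continue
        if text.endswith(":"):
            symbols[text[:-1]] = address    # label = current address
            continue
        instructions.append((address, text))
        address += 4                        # one 32-bit word per instruction

    # Pass 2: every label is known now, so forward references resolve.
    resolved = []
    for addr, text in instructions:
        mnemonic, _, operands = text.partition(" ")
        ops = [o.strip() for o in operands.split(",")] if operands else []
        ops = [hex(symbols[o]) if o in symbols else o for o in ops]
        resolved.append((addr, (mnemonic + " " + ", ".join(ops)).strip()))
    return symbols, resolved
```

The first pass is what makes a forward reference such as `BRA done` legal: the address of `done` is not known when the branch is read, only after the whole source has been scanned.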
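vsim simulates the CPU's fetch -> decode -> execute loop, executing machine instructions one at a time until it reaches a halt instruction or a runtime error.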
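A fetch -> decode -> execute loop of this kind might look like the sketch below. It is not vsim's source: it handles only NOP, ADD, LDI and HLT, ignores the condition flags, and the names `run` and `sign_extend` are hypothetical. The field positions (5-bit opcode in bits 31..27, rdst in 26..22, rs1 in 21..17, immediate flag in bit 16, rs2 in 15..11) are inferred from the machine codes listed in this document's instruction table, and sign extension of immediates is an assumption.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    mask = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ mask) - mask


def run(memory, max_steps=1000):
    """Execute 32-bit words from memory (dict: address -> word) until HLT."""
    regs = [0] * 32
    pc = 0
    for _ in range(max_steps):
        word = memory.get(pc, 0)            # fetch (unset memory reads as NOP)
        pc += 4
        opcode = (word >> 27) & 0x1F        # decode
        rdst = (word >> 22) & 0x1F
        rs1 = (word >> 17) & 0x1F
        if opcode == 31:                    # HLT: stop the machine
            return regs
        if opcode == 0:                     # NOP
            continue
        if opcode == 11:                    # LDI rdst,#value (22-bit immediate)
            regs[rdst] = sign_extend(word, 22) & 0xFFFFFFFF
        elif opcode == 1:                   # ADD, register or immediate form
            if (word >> 16) & 1:
                operand = sign_extend(word, 16)
            else:
                operand = regs[(word >> 11) & 0x1F]
            regs[rdst] = (regs[rs1] + operand) & 0xFFFFFFFF
        else:                               # runtime error ends the loop
            raise ValueError("unsupported opcode %d at 0x%04x" % (opcode, pc - 4))
    raise RuntimeError("step limit reached without HLT")
```

For example, loading the words `58400000`, `08430001`, `08430001`, `F8000000` at addresses 0, 4, 8 and 12 and calling `run` leaves 2 in `r1`.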
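Reading and tinkering with this code will help you understand and learn:
1. How a two-pass assembler works, and how to design and write one.
2. How lex and yacc are used to build an assembler and an instruction simulator.
3. How a CPU executes instructions.
4. Background that is useful for understanding computer architecture.
5. Several sample .asm assembly source programs are included for testing.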
/////README file
##############################################################
# README file for VeSPA assembler & instruction simulation
# snallieATtomDOTcom
# Sat Nov 15 13:44:43 CST 2014
##############################################################
1. To build:
# make
or
# make clean; make
This builds vasm and vsim.
2. Run vasm:
# ./vasm ./asm_example/count.org.asm
This produces ./asm_example/count.org.hx.
3. Run vsim:
# ./vsim ./asm_example/count.org.hx
A sample vsim session follows:
[root@rh9 vas]# ./vsim ./asm_example/count.org.hx
Designing Digital Computer Systems with Verilog, 2005
David J. Lilja and Sachin S. Sapatnekar
http://www.arctic.umn.edu/vespa/
VeSPA Instruction Interactive Simulator. version 0.01
snallieATtomDOTcom, Thu May 2 05:40:37 CST 2013
built: Nov 15 2014 - 13:51:25
hint: 'h' or '?' for help
CPU_endian = BIG
[r]>> u 0 1c
@0000 50000018 | LD r0 , 0x18
@0004 58400000 | LDI r1 , #0
@0008 08430001 | ADD r1 , r1 , #0x1
@000c 38020000 | CMP r1 , r0
@0010 46fffff4 | BLE 0x8
@0014 f8000000 | HLT
@0018 0000000a | NOP
@001c 00000000 | NOP
[w]>> w a
All watch registers enabled.
[w]>> t
0004: 58400000 LDI r1 , #0
@0008 08430001 ADD r1 , r1 , #0x1
PC =00000008 N=0 Z=0 C=0 V=0
r0 =0000000a r1 =00000000 r2 =00000000 r3 =00000000 r4 =00000000
r5 =00000000 r6 =00000000 r7 =00000000 r8 =00000000 r9 =00000000
r10=00000000 r11=00000000 r12=00000000 r13=00000000 r14=00000000
r15=00000000 r16=00000000 r17=00000000 r18=00000000 r19=00000000
r20=00000000 r21=00000000 r22=00000000 r23=00000000 r24=00000000
r25=00000000 r26=00000000 r27=00000000 r28=00000000 r29=00000000
r30=00000000 r31=00000000
instructions_consumed = 2
[t]>> t
0008: 08430001 ADD r1 , r1 , #0x1
@000c 38020000 CMP r1 , r0
PC =0000000c N=0 Z=0 C=0 V=0
r0 =0000000a r1 =00000001 r2 =00000000 r3 =00000000 r4 =00000000
r5 =00000000 r6 =00000000 r7 =00000000 r8 =00000000 r9 =00000000
r10=00000000 r11=00000000 r12=00000000 r13=00000000 r14=00000000
r15=00000000 r16=00000000 r17=00000000 r18=00000000 r19=00000000
r20=00000000 r21=00000000 r22=00000000 r23=00000000 r24=00000000
r25=00000000 r26=00000000 r27=00000000 r28=00000000 r29=00000000
r30=00000000 r31=00000000
instructions_consumed = 3
[t]>>
[u]>> ?
Designing Digital Computer Systems with Verilog, 2005
David J. Lilja and Sachin S. Sapatnekar
http://www.arctic.umn.edu/vespa/
VeSPA Instruction Interactive Simulator. version 0.01
snallieATtomDOTcom, Thu May 2 05:40:37 CST 2013
built: Nov 15 2014 - 13:51:25
hint: 'h' or '?' for help
CPU_endian = BIG
help menu (attention: command is case sensitive.):
b: set or clear break point, syntax: b addr_in_hex or b
c: continue the rest of the instructions
d: dump memory, syntax: d or d from to
e: edit memory, syntax: e or e addr_hex
g: go through the program
G: go to the breakpoint
h/?: this help
i: VeSPA Instruction Set
o: do over again
q/Q: quit
r: dump or edit register, syntax: r or r rn (n=0..31)
s: print current CPU state
t: trace single step or n step, syntax: t [n>0]
u: disassemble the program, syntax: u or u from to or u a
x: examine memory, syntax: x or x addr_hex_list_deli_with_space
w: watch register, syntax: w or w rn (n=0..31) or w rStart rEnd
[?]>>
4. The prebuilt files vasm.exe and vsim.exe are meant to be run from the DOS prompt.
5. For more information, please refer to the PDF file Designing Digital Computer Systems with Verilog
accompanying this package, or visit http://www.arctic.umn.edu/vespa/ .
Have fun!
/////misc.txt file
vasm & vsim
1. A semicolon starts a single-line comment
ADD R1,R2,R3 ; single line comment
2. C/C++ block comments and single-line comments are also supported
/*
* C-style multiple line comment
* second line of C-style block comment
*/
// C++ style comment
3. Full instruction set
Mnem  OP_dec  OP_bin  Syntax                 Example             MachCode  Operation
====  ======  ======  =====================  ==================  ========  ===========================
NOP   0       00000   NOP                    NOP                 00000000
ADD   1       00001   ADD rdst,rs1,rs2       ADD r2,r3,r5        08862800  r2 <- r3 + r5
                      ADD rdst,rs1,#immed16  ADD r6,r8,#0x62     09910062  r6 <- r8 + 0x62
SUB   2       00010   SUB rdst,rs1,rs2       SUB r2,r3,r5        10862800  r2 <- r3 - r5
                      SUB rdst,rs1,#immed16  SUB r6,r8,#0x62     11910062  r6 <- r8 - 0x62
OR    3       00011   OR rdst,rs1,rs2        OR r2,r3,r5         18862800  r2 <- r3 | r5
                      OR rdst,rs1,#immed16   OR r6,r8,#0x62      19910062  r6 <- r8 | 0x62
AND   4       00100   AND rdst,rs1,rs2       AND r2,r3,r5        20862800  r2 <- r3 & r5
                      AND rdst,rs1,#immed16  AND r6,r8,#0x62     21910062  r6 <- r8 & 0x62
NOT   5       00101   NOT rdst,rs1           NOT r2,r3           28860000  r2 <- ~r3
XOR   6       00110   XOR rdst,rs1,rs2       XOR r2,r3,r5        30862800  r2 <- r3 ^ r5
                      XOR rdst,rs1,#immed16  XOR r6,r8,#0x62     31910062  r6 <- r8 ^ 0x62
CMP   7       00111   CMP rs1,rs2            CMP r2,r3           38041800  r2 - r3
                      CMP rs1,#value         CMP r2,#0x62        38050062  r2 - 0x62
Bxx   8       01000   Bxx --------           Bxx --------
JMP   9       01001   JMP rs1,#value         JMP r16,#0x27       48200027  PC <- r16 + 0x27
JMPL  9       01001   JMPL rdst,rs1,#value   JMPL r31,r16,#0x27  4FE10027  r31 <- PC; PC <- r16 + 0x27
LD    10      01010   LD rdst,LABEL          LD r6,0x100         51800100  r6 <- Mem[0x100]
LDI   11      01011   LDI rdst,#value        LDI r6,#0x2d        5980002D  r6 <- 0x2d
LDX   12      01100   LDX rdst,rs1           LDX r6,r8           61900000  r6 <- Mem[r8]
                      LDX rdst,rs1,#value    LDX r6,r8,#0x2e     6190002E  r6 <- Mem[r8+0x2e]
ST    13      01101   ST LABEL,rst           ST 0x100,r6         69800100  Mem[0x100] <- r6
STX   14      01110   STX rs1,rst            STX r8,r6           71900000  Mem[r8] <- r6
                      STX rs1,#value,rst     STX r8,#0x2e,r6     7190002E  Mem[r8+0x2e] <- r6
MLT   15      01111   MLT rdst,rs1,rs2       MLT r2,r3,r5        78862800  r2 <- r3 * r5
                      MLT rdst,rs1,#immed16  MLT r6,r8,#0x62     79910062  r6 <- r8 * 0x62
HLT   31      11111   HLT                    HLT                 F8000000
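The machine codes in the table follow a regular bit layout, which the sketch below reproduces for the three-operand arithmetic and logic instructions. The layout (opcode in bits 31..27, rdst in 26..22, rs1 in 21..17, immediate flag in bit 16, then either rs2 in bits 15..11 or a 16-bit immediate) is inferred from the table's examples rather than taken from vasm's source, and the name `encode_alu` is hypothetical.

```python
# Opcodes copied from the OP_dec column of the instruction table.
OPCODES = {"ADD": 1, "SUB": 2, "OR": 3, "AND": 4, "XOR": 6, "MLT": 15}


def encode_alu(mnemonic, rdst, rs1, rs2=None, imm16=None):
    """Encode 'OP rdst,rs1,rs2' or 'OP rdst,rs1,#immed16' as a 32-bit word."""
    word = (OPCODES[mnemonic] << 27) | (rdst << 22) | (rs1 << 17)
    if imm16 is not None:
        # Immediate form: bit 16 is set and the low 16 bits hold the value.
        word |= (1 << 16) | (imm16 & 0xFFFF)
    else:
        # Register form: bit 16 is clear and rs2 sits in bits 15..11.
        word |= rs2 << 11
    return word
```

With these assumptions, `encode_alu("ADD", 2, 3, rs2=5)` gives `0x08862800` and `encode_alu("ADD", 6, 8, imm16=0x62)` gives `0x09910062`, matching the two ADD rows of the table.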
BXX -----  MachCode  Cond  Condition
=========  ========  ====  ======================================
BRA 0x100  400000FC  0000  always
BNV 0x100  440000F8  1000  never
BCC 0x100  408000F4  0001  carry clear (~C)
BCS 0x100  448000F0  1001  carry set (C)
BVC 0x100  410000EC  0010  overflow clear (~V)
BVS 0x100  450000E8  1010  overflow set (V)
BEQ 0x100  418000E4  0011  equal (Z)
BNE 0x100  458000E0  1011  not equal (~Z)
BGE 0x100  420000DC  0100  greater than or equal to (~N~V|NV)
BLT 0x100  460000D8  1100  less than (N(~V)|(~N)V)
BGT 0x100  428000D4  0101  greater than (~Z(~N~V|NV))
BLE 0x100  468000D0  1101  less than or equal to (Z|(N(~V)|(~N)V))
BPL 0x100  430000CC  0110  plus (positive) (~N)
BMI 0x100  470000C8  1110  minus (negative) (N)
=========  ========  ====  ======================================
MOV rA,rB <=> ADD rA,rB,#0
ADD and SUB may affect Flags: NZCV
=========  ========  ====  ======================================
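The branch table can be reproduced with a short sketch. It assumes, based on the machine codes above, that a branch word holds the opcode 8 in bits 31..27, the 4-bit condition in bits 26..23, and a 23-bit two's-complement offset measured from the address of the following instruction (PC + 4). It also assumes the 14 example branches were assembled back to back starting at address 0, which is why each row has a different offset for the same target. The names `encode_branch` and `condition_holds` are hypothetical.

```python
# Condition codes copied from the Cond column of the branch table.
CONDITIONS = {
    "BRA": 0b0000, "BNV": 0b1000, "BCC": 0b0001, "BCS": 0b1001,
    "BVC": 0b0010, "BVS": 0b1010, "BEQ": 0b0011, "BNE": 0b1011,
    "BGE": 0b0100, "BLT": 0b1100, "BGT": 0b0101, "BLE": 0b1101,
    "BPL": 0b0110, "BMI": 0b1110,
}


def encode_branch(mnemonic, target, pc):
    """Encode a branch located at address pc that jumps to target."""
    offset = target - (pc + 4)              # relative to the next instruction
    return (8 << 27) | (CONDITIONS[mnemonic] << 23) | (offset & 0x7FFFFF)


def condition_holds(mnemonic, n, z, c, v):
    """Evaluate a branch condition against the N, Z, C, V flags."""
    code = CONDITIONS[mnemonic]
    base = {
        0b000: True,                        # always
        0b001: not c,                       # carry clear
        0b010: not v,                       # overflow clear
        0b011: z,                           # equal
        0b100: n == v,                      # greater than or equal
        0b101: (not z) and n == v,          # greater than
        0b110: not n,                       # plus
    }[code & 0b111]
    # In the table, setting the top condition bit negates the base test.
    return (not base) if code & 0b1000 else bool(base)
```

Under these assumptions `encode_branch("BLE", 0x8, 0x10)` yields `0x46FFFFF4`, the word shown at address 0010 in the vsim session from the README.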
alias of registers:
%fp(frame pointer):r28, %sp(stack pointer):r29, %ln(link register):r30, %rv(return value):r31
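4. Assembler directives (case-insensitive; every directive starts with '.'):
.org
.word
.byte
.alloc
.align
.string
1) .org sets the start address of the next instruction or data item, e.g.:
.org 0x200
ADD R1,R2,R3 ; the ADD R1,R2,R3 instruction is assembled at memory address 0x200
.org 0x200
buf_data: .word 1,2,3 ; buf_data is assigned memory address 0x200
2) .word allocates and initializes word-sized (4-byte) memory, e.g.:
.org 0x200
buf_data: .word 1,2,3 ; buf_data is assigned address 0x200, and the 3 words starting at buf_data are initialized to 1,2,3
.org 0x300
.word 0x100 ; one word is allocated at address 0x300 and initialized to 0x100
3) .byte allocates and initializes byte-sized (1-byte) memory, e.g.:
.org 0x200
buf_data: .byte 1,2,3 ; buf_data is assigned address 0x200, and the 3 bytes starting at buf_data are initialized to 1,2,3
4) .alloc N allocates N bytes of uninitialized memory, where N is a positive integer, e.g.:
.org 0x400
.alloc 10 ; allocates 10 bytes of uninitialized memory at address 0x400
5) .align N aligns the current address. N may be 2, 4, or 8, e.g.:
.align 4 ; if the current address is not 4-byte aligned, advance it to the next 4-byte-aligned address
6) .string defines a C-style string; C escape sequences are not supported, e.g.:
my_str: .string "this is a C-style string" ; stores the string "this is a C-style string" at the memory location labeled my_str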
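The effect of these directives on the assembler's location counter can be sketched as below. This is an illustration of the rules described above, not vasm's code: the function names are hypothetical, and the extra terminating NUL byte for `.string` is an assumption based on the phrase "C-style string".

```python
def align_up(address, n):
    """Round address up to the next multiple of n (unchanged if aligned)."""
    return (address + n - 1) // n * n


def next_address(address, directive, args):
    """Return the location counter after applying one directive."""
    name = directive.lower()                # directives are case-insensitive
    if name == ".org":
        return args[0]                      # jump to the given address
    if name == ".word":
        return address + 4 * len(args)      # 4 bytes per initialized word
    if name == ".byte":
        return address + len(args)          # 1 byte per initialized value
    if name == ".alloc":
        return address + args[0]            # N uninitialized bytes
    if name == ".align":
        if args[0] not in (2, 4, 8):
            raise ValueError(".align accepts only 2, 4 or 8")
        return align_up(address, args[0])
    if name == ".string":
        return address + len(args[0]) + 1   # assumed: characters plus NUL
    raise ValueError("unknown directive: " + directive)
```

For instance, after `.org 0x200` and `buf_data: .word 1,2,3` the location counter stands at 0x20C, so the next item would be placed there unless another `.org` moves it.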
/////sel_sort.asm file (a selection sort example program)
; sel_sort.asm
; selection_sort algorithm
; snallieATtomDOTcom
; Time-stamp: <2014-11-15 00:29:52 root>
; int main()
; {
;     int data[] = { 11, 2, 13, 4, -1, 15, 6, 7, 8, 9 };
;     int i, j;
;     int min;
;     int tmp;
;     for (i = 0; i < 10; i++) {
;         min = i;
;         for (j = i; j < 10; j++) {
;             if (data[j] < data[min]) {
;                 min = j;
;             }
;         }
;         tmp = data[i];
;         data[i] = data[min];
;         data[min] = tmp;
;     }
;     for (j = 0; j < 10; j++) {
;         printf("%d ", data[j]);
;     }
; }
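.org 0
CNT .EQU 40                 ; array size in bytes (10 words * 4)
start:
    LDI r5 , #0             // j (byte offset into data)
    LDI r4 , #0             // i (byte offset into data)
    MOV r8 , r5             ; r8 = min (offset of the current minimum)
L1:
    CMP r4 , #CNT-4         ; stop once i reaches the last element
    BEQ quit
    LDX r7 , r5 , #data     ; here j == i, so r7 = data[i]
    MOV r1 , r7             ; r1 = initial minimum value
L2:
    LDX r7 , r5 , #data     ; r7 = data[j]
    CMP r7 , r1
    BGE next_j              ; skip unless data[j] < Min
    MOV r8 , r5             ; min = j
    MOV r1 , r7             ; r1=Min
next_j:
    ADD r5 , r5 , #0x4      ; j++
    CMP r5 , #CNT
    BEQ next_i              ; inner loop done
    BRA L2
next_i:
    LDX r9 , r8 , #data     ; r9 = data[min]
    LDX r10, r4 , #data     ; r10 = data[i]
    STX r8 , #data, r10     ; data[min] = data[i]
    STX r4 , #data, r9      ; data[i] = old data[min]
    ADD r4 , r4 , #0x4      ; i++
    MOV r5 , r4             ; j = i
    MOV r8 , r5             ; min = i
    BRA L1
quit:
    HLT
data: .word 11,2,13,4,-1,15,6,7,8,9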