CS61C Project 2MIPS Instruction Set Simulator
TA: Eric Liang (code adapted from framework by Yunsup and Andrew W.)
Intermediate milestone due Sunday Feb 19 @ 23:59:59 PST
Project due Sundary Feb 26 @ 23:59:59 PST
Updates
- 2/23 - Fixed a bug in the Makefile which was causing incorrect traces to be generated for rt30.
- 2/22 - Four more autograder test cases have been released. You may obtain them by pulling as usual.
- 2/22 - Grades for part 1 are now on glookup.
- 2/18 - Updated failsafe git fix instructions to cover empty proj2 dir case.
- 2/16 - Fixed another bug in submit check script: run 'git pull proj2 master' to update.
- 2/16 - Github permissions all fixed - you can push now.
- 2/14 - Changed access_ok() spec to require returning 0 for NULL dereferences.
Background
In this project, you will create an instruction interpreter for a subset of the MIPS instruction set. You’ll provide the machinery to decode and execute a couple dozen MIPS instructions. You're creating what is effectively a miniature version of MARS, but better yet, you’ll be able to run real C programs on your MIPS simulator!
The MIPS green sheet provides information necessary for completing this project.
Getting started
This is an individual assignment. All of your work must be yours alone.
This project requires a functional mips-gcc to be installed. The only computers that will work for this project are those in the hive lab. Here's a hive machine you can try logging in to:
$ ssh -X cs61c-XX@hive.cs.berkeley.edu
Navigate to your git repo and add the project2 template as a remote. Failsafe way of fixing a broken git setup
$ cd <REPO>
$ git remote rm proj2 # only if you've messed up remote add before, otherwise you don't have to do this
$ git remote add proj2 git://github.com/ucberkeley-cs61c/proj2.git
$ git pull proj2 master
This should have pulled the project files into your repo. If the template is updated, you can re-run the last command to pull in the latest changes, which will have git automatically merge in the updates.
If you are able, run 'gitk' for a graphical visualization of the merge. If all has gone well, you should be looking at something like this, where 'master' refers to your current branch and 'remotes/origin/master' is the copy on github:
The proj2 directory includes the following files. You need to modify the files colored blue. You may find it helpful to look at the files colored black. You definitely do not need to look at the files colored red.
- sim.c: C source file which contains the top level simulator. This file contains the main function.
- processor.c: C source file which implements the processor.
- processor.h: C header file for processor.
- memory.c: C source file which implements the memory interface.
- memory.h: C header file for memory.
- disassemble.c: C source file which implements disassemble.
- disassemble.h: C header file for disassembler.
- load_program.c: C source file which implements loading an executable file.
- load_program.h: C header file for program loader.
- elf.h: C header file for ELF definitions. (ELF is the standard executable file format in Linux.)
- Makefile: File which records all dependencies.
- mipscode/*: Assembly and C source files for tests.
Project
The files provided in the start kit comprise a framework for a MIPS simulator. You’ll complete the program by adding code to processor.c and memory.c to execute each instruction and perform memory access. Additionally, you’ll add code to disassemble.c to print out the human-readable disassembly corresponding to the instruction’s machine code. Your simulator must be able to simulate the machine code versions of the following MIPS machine instructions. We’ve already implemented the pink-colored instructions for you, so you have an example of an R-type, I-type, and J-type instruction.
It is critical that you read and understand the definitions in processor.h before starting the project. If they look mysterious, re-read chapter 6 of K&R, which covers structs, bitfields, and unions. Check yourself: why does sizeof(inst_t) == 4?
|
INSTRUCTION |
TYPE |
OPCODE |
FUNCT |
OPERATION |
|
sll rd,rt,shamt |
R |
0x0 |
0x0 |
R[rd] <- R[rt] << shamt |
|
srl rd,rt,shamt |
R |
0x0 |
0x2 |
R[rd] <- R[rt] >> shamt |
|
sra rd,rt,shamt |
R |
0x0 |
0x3 |
R[rd] <- (signed)R[rt] >> shamt |
|
jr rs |
R |
0x0 |
0x8 |
PC <- R[rs] |
|
jalr rd,rs |
R |
0x0 |
0x9 |
tmp <- PC + 4 PC <- R[rs] R[rd] <- tmp |
|
syscall |
R |
0x0 |
0xc |
(perform system call) |
|
addu rd,rs,rt |
R |
0x0 |
0x21 |
R[rd] <- R[rs] + R[rt] |
|
subu rd,rs,rt |
R |
0x0 |
0x23 |
R[rd] <- R[rs] - R[rt] |
|
and rd,rs,rt |
R |
0x0 |
0x24 |
R[rd] <- R[rs] & R[rt] |
|
or rd,rs,rt |
R |
0x0 |
0x25 |
R[rd] <- R[rs] | R[rt] |
|
xor rd,rs,rt |
R |
0x0 |
0x26 |
R[rd] <- R[rs] ^ R[rt] |
|
nor rd,rs,rt |
R |
0x0 |
0x27 |
R[rd] <- ~(R[rs] | R[rt]) |
|
slt rd,rs,rt |
R |
0x0 |
0x2a |
R[rd] <- (signed)R[rs] < (signed)R[rt] |
|
sltu rd,rs,rt |
R |
0x0 |
0x2b |
R[rd] <- R[rs] < R[rt] |
|
j target |
J |
0x2 |
- |
PC <- {(upper 4 bits of PC+4), address*4} |
|
jal target |
J |
0x3 |
- |
R[31] <- PC + 4 PC <- {(upper 4 bits of PC+4), address*4} |
|
beq rs,rt,offset |
I |
0x4 |
- |
if(R[rs] == R[rt]) PC <- PC + 4 + signext(offset)*4 |
|
bne rs,rt,offset |
I |
0x5 |
- |
if(R[rs] != R[rt]) PC <- PC + 4 + signext(offset)*4 |
|
addiu rt,rs,imm |
I |
0x9 |
- |
R[rt] <- R[rs] + signext(imm) |
|
slti rt,rs,imm |
I |
0xa |
- |
R[rt] <- (signed)R[rs] < signext(imm) |
|
sltiu rt,rs,imm |
I |
0xb |
- |
R[rt] <- R[rs] < signext(imm) |
|
andi rt,rs,imm |
I |
0xc |
- |
R[rt] <- R[rs] & zeroext(imm) |
|
ori rt,rs,imm |
I |
0xd |
- |
R[rt] <- R[rs] | zeroext(imm) |
|
xori rt,rs,imm |
I |
0xe |
- |
R[rt] <- R[rs] ^ zeroext(imm) |
|
lui rt,imm |
I |
0xf |
- |
R[rt] <- imm << 16 |
|
lb rt,offset(rs) |
I |
0x20 |
- |
R[rt] <- signext(LoadMem(R[rs]+signext(offset), byte)) |
|
lh rt,offset(rs) |
I |
0x21 |
- |
R[rt] <- signext(LoadMem(R[rs]+signext(offset), half)) |
|
lw rt,offset(rs) |
I |
0x23 |
- |
R[rt] <- LoadMem(R[rs]+signext(offset), word) |
|
lbu rt,offset(rs) |
I |
0x24 |
- |
R[rt] <- zeroext(LoadMem(R[rs]+signext(offset), byte)) |
|
lhu rt,offset(rs) |
I |
0x25 |
- |
R[rt] <- zeroext(LoadMem(R[rs]+signext(offset), half)) |
|
sb rt,offset(rs) |
I |
0x28 |
- |
StoreMem(R[rs]+signext(offset)], byte, R[rt]) |
|
sh rt,offset(rs) |
I |
0x29 |
- |
StoreMem(R[rs]+signext(offset)], half, R[rt]) |
|
sw rt,offset(rs) |
I |
0x2b |
- |
StoreMem(R[rs]+signext(offset)], word, R[rt]) |
If you have any questions about the semantics of the instructions, consult P&H (4th ed.) Section 2.9 and Appendix B, particularly if you find our pseudocode for loads and stores to be confusing. (The first argument to LoadMem and StoreMem in our pseudocode is the memory address. The second argument to both LoadMem and StoreMem is the width of the access. The third argument to StoreMem is the store data.)
The MIPS system you’re implementing is little-endian. Look at P&H (4th edition) page B-43 for further information on endianness (byte order). We chose little endian because the host machines (x86 hive computers) are little endian, and this avoids some complexity for sub-word loads and stores.
In our variant of MIPS, beq and bne are not delayed branches. (If you don’t know what that means, just ignore it; your default assumption about the behavior of branches is probably correct for this project. We’ll talk about this topic later in the semester.)
Initially, you will be able to run a program called “simple”, which only uses the pink colored instructions, by running the following commands:
$ cd <REPO>/proj2
$ make
$ ./mips-sim mipscode/simple
Hello, world!
exiting the simulator
Framework code
The framework code we’ve provided begins by doing the following.
- It reads the executable program binary, whose filename is specified on the command line, into the simulated memory, starting at address 0x00001000.
- It initializes all 32 MIPS registers to 0 and sets the program counter (PC) to 0x00001000.
- It sets flags that govern how the program interacts with the user. Depending on the options specified on the command line, the simulator will either show a dissassembly dump of the program on the command line, or it will execute the program.
It then enters the main simulation loop, which simply calls execute_one_inst() repeatedly until the simulation is complete. execute_one_inst() performs the following tasks:
- It fetches an instruction from memory, using the PC as the address.
- It examines the opcode to determine what instruction was fetched.
- It executes the instruction and updates the PC.
The framework supports a handful of command-line options:
- -i runs the simulator in “interactive mode,” in which the simulator executes an instruction each time the Enter key is pressed. The disassembly of each executed instruction is printed.
- -t runs the simulator in “tracing mode,” in which each instruction executed is printed.
- -r instructs the simulator to print the contents of all 32 registers after each instruction is executed. It’s most useful when combined with the -i flag.
- -d instructs the simulator to disassemble the entire program, then quit.
Your Job
Your job is to complete the implementations of the disassemble() function in disassemble.c, the execute_one_inst() function in processor.c, and the load_mem() store_mem() access_ok() functions in memory.c. Right now, they only operate correctly for the or, ori, j, and syscall instructions. By the time you’re finished, they should handle all of the instructions in the table above.
Your first task will be to implement the disassembler. (15 points, due at the intermediate milestone)
- Print the instruction name. If the instruction has arguments, print a tab (\t).
- Print all arguments, following the order and formatting given in the INSTRUCTION column of the table above.
- Arguments are generally comma-separated (lw/sw, however, also use parentheses), but are not separated by spaces.
- Register arguments are printed as a $ followed by the register number, in decimal (e.g. $0 or $31).
- Zero-extended immedates (e.g. for ori) are printed as lowercase hexadecimal numbers with a leading 0x but without extra leading zeros (e.g. 0x0 or 0xffff).
- Jump addresses should be printed like zero-extended immediates, but multiplied by 4 first. (Assume the upper 4 bits of PC+4 are zero.)
- Sign-extended immediates (e.g. for addiu, sw, or beq) are printed as signed decimal numbers (e.g. -12 or 48). For branch offsets, multiply the offset by 4.
- Shift amounts (e.g. for sll) are printed as unsigned decimal numbers (e.g. 0 or 31).
- Print a newline (\n) at the end of an instruction.
- We will be using an autograder to grade this task. If your output differs from ours due to formatting errors, you will not receive credit.
- We have provided a disassembly test for you. Since a test is only covering a subset of possible scenarios, however, passing this test does not mean that your code is bug free. You should identify the corner cases and test them yourself.
Run the disassembly test by typing in “make disasmtest”. If your disassembly matches the output, you will get a “DISASSEMBLY TEST PASSED!” message.
$ cd <REPO>/proj2
$ make disasmtest
./mips-sim -d mipscode/insts > insts.dump
DISASSEMBLY TEST PASSED!
If your disassembly does not match the output, you will get the difference between the reference output and your output, and a “DISASSEMBLY TEST FAILED!” message. Make sure you pass this test before submitting your first task.
Reminder of how to git works.
Your local repository is
indicated by the rectangular outline.
Run 'git status' or 'gitk' at any
time to determine the state
of your repo.
$ cd <REPO>/proj2
$ make disasmtest
./mips-sim -d mipscode/insts > insts.dump
56c56
< 000010dc: addiu $30,$0,11
---
> 000010dc: addiu $30,$0,0x11
66c66
< 00001104: bne $4,$5,44
---
> 00001104: bne $4,$5, 44
DISASSEMBLY TEST FAILED!
This is the end of part 1 of the project. You have now done enough work for the intermediate checkpoint, due on the 19th. For the intermediate milestone, tag the point in history you want to submit as "proj2-1" and push your changes to github:
$ git tag -f proj2-1 <rev>
$ git push
$ git push --tags
$ make check-part1
The make check-part1 command will verify that you have submitted correctly. Only changes to proj2/disassemble.c will be considered by the autograder for the checkpoint. If you want to update your submission, just repeat the git tag -f command and push the updated tag. We will take a snapshot of all git repos at each submit deadline and again for late submissions. If you update the tag after the deadline, your project may be considered late.
Your second task will be to actually implement the remaining instructions in the processor. (85 points, due at the end)
One issue we have not addressed thus far is address
alignment for loads and stores. The simulator should abort if a memory access is improperly aligned or outside of the range [1,MEM_SIZE). The same behavior should occur for misaligned loads and stores. lw and sw require 4-byte alignment, like the PC.
lh, lhu, and sh require 2-byte alignment. lb, lbu, and sb are always properly aligned. For misaligned or
out-of-range addresses (this includes the NULL address 0), you must have access_ok() return 0 in memory.c, which will cause the simulator to print an error message and abort. This is analogous to the segmentation fault you may see when an x86 program does an illegal memory access.
We have provided a simple self-checking assembly test
that tests several of the instructions. However, the
test is not exhaustive and does not exercise every instruction.
Here’s how to run the test. (The output is from a working
processor.) $ cd <REPO>/proj2 $ make runtest ./mips-sim -r mipscode/insts > insts.trace RUN TEST PASSED! Most likely you will have bugs, so try the tracing mode or other debugging modes described in the Framework code section above.
We have provided a few more tests and the ability to write your own. We described simple above, which prints out “ Hello, world!”, and is written in assembly.
Here’s an example of how to add an assembly test: Now build your assembly test, and then run it by typing
in the following commands: $ cd <REPO>/proj2 $ make $ ./mips-sim mipscode/foo
You can, and indeed should, write your own assembly tests to test specific instructions and their corner cases. Furthermore, you should be compiling and testing your code after each group of instructions you implement. It will be very hard to debug your project if you wait until the end to test.
Additionally, you can write test programs in C and run
them on your simulator, once you’ve implemented all of the instructions! We
provided ackermann, which computes the
Ackermann function A(m,n) recursively. It will not work until you have
implemented the processor fully. This is a sample run of the program on a correct
processor: $ ./mips-sim mipscode/ackermann A(3, 3) = 61 exiting the simulator To write your own C tests, stick to a subset of C that
doesn’t use floating-point. Additionally, don’t use any C library
functions. Look at ackermann.c for an example of how to print things in our restricted environment.
Here’s how to do write a C test: Now build your C program, and then run it by typing in
the following commands: $ cd <REPO>/proj2 $ make $ ./mips-sim mipscode/bar For the final results, only changes to the files proj2/disassemble.c, proj2/processor.c, and proj2/memory.c will be considered by the autograder. Submit by pushing a tagged commit to your github account: $ git tag -f proj2-2 <rev> $ git push $ git push --tags $ make check-part2
$ ./mips-sim -t mipscode/simple
00001000: ori $4,$4,0x1014
00001004: ori $2,$0,0x4
00001008: syscall
Hello, world!
0000100c: ori $2,$0,0xa
00001010: syscall
exiting the simulator