Implementation of bytecode virtual machine in C++
This project is an implementation of the LC-3 virtual machine in C++.
Think of this virtual machine like the Java Virtual Machine (JVM). Instead of writing platform-specific code, we design a standard virtual machine architecture for the programming language. Regardless of the platform, the compiler or interpreter generates bytecode for this standard VM architecture. This means there is no need to have different versions of the codebase for ARM, x86, Windows, etc. The virtual machine then runs this bytecode and executes the program, ensuring compatibility across various platforms.
+---------------------+
|   Source Code (C,   |
| Python, Java, etc.) |
+----------+----------+
           |
           v
+----------+----------+
| Compiler/Interpreter|
|(Converts to Bytecode|
|  or Machine Code)   |
+----------+----------+
           |
           v
+----------+----------+
|  Bytecode/Machine   |
|        Code         |
+----------+----------+
           |
           v
+----------+----------+
|   Virtual Machine   |
| (Executes Bytecode) |
+----------+----------+
           |
           v
+----------+----------+
|   Hardware (CPU,    |
|   Memory, etc.)     |
+---------------------+
For instruction specifications and more details, see the LC-3 specification.
git clone https://github.com/susantabiswas/Systems.git
cd Systems/VirtualMachine/
This VM runs LC-3 machine code. As a demonstration, we will run an ASCII version of the 2048 console game on it.
g++ lc3_vm.cpp -o lc3
# ./lc3 <path to lc3 assembled binary file>
# you can use the sample 2048.obj file provided in assets
./lc3 assets/2048.obj
Control the game using WASD keys.
Are you on an ANSI terminal (y/n)?
+--------------------------+
| |
| 2 |
| |
| 2 |
| |
+--------------------------+
| |
| |
| |
| 2 8 2 |
| |
| 4 16 8 |
| |
| 8 2 4 2 |
| |
+--------------------------+
Received signal: 2
The VM simulates the behavior of the LC-3 CPU, including instruction fetch, decode, and execute cycles. Below is a summary of the key components and flow of the VM:
+---------------------+
|       Memory        |
|---------------------|
|   Address Space:    |
|   0x0000 - 0xFFFF   |
+---------------------+
           |
           v
+---------------------+
|   Program Counter   |
|        (PC)         |
+---------------------+
           |
           v
+---------------------+
| Instruction Register|
|        (IR)         |
+---------------------+
           |
           v
+---------------------+
|    Control Unit     |
+---------------------+
           |
           v
+---------------------+
|  Arithmetic Logic   |
|     Unit (ALU)      |
+---------------------+
           |
           v
+---------------------+
|   General Purpose   |
|      Registers      |
|   R0, R1, R2, R3,   |
|   R4, R5, R6, R7    |
+---------------------+
           |
           v
+---------------------+
|   Condition Flags   |
|  (Positive, Zero,   |
|      Negative)      |
+---------------------+
           |
           v
+---------------------+
|    Input/Output     |
| (Keyboard, Display) |
+---------------------+
+---------------------+
|       Memory        |
| 0000 - Trap Vector  |
|        Table        |
| 00FF                |
| 0100 - Interrupt    |
|        Vector Table |
| 01FF                |
| 0200 - OS & Kernel  |
|        Stack        |
| 2FFF                |
| 3000 - User Program |
| FDFF                |
| FE00 - Memory Mapped|
|        Registers    |
| FFFF                |
+---------------------+
|     Memory Map      |
+---------------------+
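The memory map can be expressed in code as a flat array of 16-bit words plus a lookup that classifies an address. This is a minimal sketch, not the project's code: `Memory`, `Region`, `kUserProgramStart`, and `region_of` are hypothetical names, and it assumes the memory-mapped registers begin at 0xFE00, directly after the user program space, as in the standard LC-3 layout.

```cpp
#include <array>
#include <cstdint>

// 2^16 addressable locations, each holding one 16-bit word (128 KiB in total).
using Memory = std::array<std::uint16_t, 1 << 16>;

// Hypothetical labels for the regions shown in the memory map.
enum class Region {
    TrapVectorTable,
    InterruptVectorTable,
    OsAndStack,
    UserProgram,
    DeviceRegisters
};

// Address at which user programs start, per the memory map.
constexpr std::uint16_t kUserProgramStart = 0x3000;

// Classifies an address by comparing it against each region's upper bound.
inline Region region_of(std::uint16_t addr) {
    if (addr <= 0x00FF) return Region::TrapVectorTable;
    if (addr <= 0x01FF) return Region::InterruptVectorTable;
    if (addr <= 0x2FFF) return Region::OsAndStack;
    if (addr <= 0xFDFF) return Region::UserProgram;
    return Region::DeviceRegisters;  // assumed range: 0xFE00 - 0xFFFF
}
```

A lookup like this is mostly useful for debugging or for rejecting writes to regions that a user program should not touch. The VM itself only needs the flat array.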
+---------------------+
| Instruction Fetch:  |
| Load from memory    |
+---------------------+
           |
           v
+---------------------+
| Instruction Decode: |
| Parse the byte instr|
| and take out opcode |
+---------------------+
           |
           v
+---------------------+
| Instruction Execute:|
| Eval using opcode   |
+---------------------+
           |
           v
+---------------------+
|  Update PC & Flags  |
+---------------------+
           |
           v
+---------------------+
|  Next Instruction   |
+---------------------+
This diagram shows the cyclic fetch-decode-execute process in the VM. The PC advances as each instruction is fetched, and instructions that write a result to a register also update the condition flags before the VM moves on to the next instruction.
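The cycle can be sketched as a loop that fetches a word, extracts the opcode from the top four bits, and dispatches on it. The sketch below is illustrative only and is not taken from `lc3_vm.cpp`: the names (`Machine`, `run`, `update_flags`, `sign_extend`, and the enums) are assumptions, and only ADD and the HALT trap are handled. The bit layouts follow the LC-3 instruction encoding.

```cpp
#include <array>
#include <cstdint>

// Hypothetical names for illustration; the project's identifiers may differ.
enum Reg { R0, R1, R2, R3, R4, R5, R6, R7, PC, COND, REG_COUNT };
enum Flag : std::uint16_t { FL_POS = 1 << 0, FL_ZRO = 1 << 1, FL_NEG = 1 << 2 };
enum Op : std::uint16_t { OP_ADD = 0x1, OP_TRAP = 0xF };

struct Machine {
    std::array<std::uint16_t, 1 << 16> memory{};
    std::array<std::uint16_t, REG_COUNT> reg{};
};

// Extends a bit_count-wide two's-complement value to 16 bits.
std::uint16_t sign_extend(std::uint16_t x, int bit_count) {
    if ((x >> (bit_count - 1)) & 1) {
        x |= static_cast<std::uint16_t>(0xFFFF << bit_count);
    }
    return x;
}

// Sets exactly one condition flag based on the value just written to register r.
void update_flags(Machine& m, std::uint16_t r) {
    if (m.reg[r] == 0) m.reg[COND] = FL_ZRO;
    else if (m.reg[r] >> 15) m.reg[COND] = FL_NEG;  // top bit set means negative
    else m.reg[COND] = FL_POS;
}

void run(Machine& m) {
    bool running = true;
    while (running) {
        const std::uint16_t instr = m.memory[m.reg[PC]++];  // fetch, then advance PC
        const std::uint16_t op = instr >> 12;               // decode: top 4 bits
        switch (op) {                                       // execute
        case OP_ADD: {
            const std::uint16_t dr = (instr >> 9) & 0x7;
            const std::uint16_t sr1 = (instr >> 6) & 0x7;
            if ((instr >> 5) & 0x1) {
                // Immediate mode: the second operand is a 5-bit signed constant.
                m.reg[dr] = m.reg[sr1] + sign_extend(instr & 0x1F, 5);
            } else {
                m.reg[dr] = m.reg[sr1] + m.reg[instr & 0x7];
            }
            update_flags(m, dr);
            break;
        }
        case OP_TRAP:
            if ((instr & 0xFF) == 0x25) running = false;  // trap vector 0x25 is HALT
            break;
        default:
            running = false;  // opcode not implemented in this sketch
            break;
        }
    }
}
```

To try it, set `reg[PC]` to 0x3000, place a few encoded instructions there (for example 0x1021 for `ADD R0, R0, #1` followed by 0xF025 for `HALT`), and call `run`.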
Note: LC-3 program images store each 16-bit word in big-endian byte order, while most of today's computers are little-endian, so every word must be converted to the host's byte order as it is loaded into the VM's memory.
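The conversion amounts to swapping the two bytes of each word. Here is a minimal sketch under a few assumptions: the function names are hypothetical, the image is already in memory as a byte vector rather than read from a file, and the first word of the image is taken to be the origin address at which the remaining words are placed.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Memory = std::array<std::uint16_t, 1 << 16>;

// Swaps the high and low bytes of a 16-bit word.
inline std::uint16_t swap16(std::uint16_t x) {
    return static_cast<std::uint16_t>((x << 8) | (x >> 8));
}

// Builds a word from two bytes stored in big-endian order.
// Unlike swap16, this gives the right result on any host byte order.
inline std::uint16_t read_be16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Copies a big-endian image into memory and returns its origin address.
// Returns 0 without touching memory if the image is too short to hold an origin.
inline std::uint16_t load_image(const std::vector<std::uint8_t>& image, Memory& memory) {
    if (image.size() < 2) return 0;
    const std::uint16_t origin = read_be16(image.data());
    std::size_t addr = origin;
    // Stop at the end of the image or the end of the address space.
    for (std::size_t i = 2; i + 1 < image.size() && addr < memory.size(); i += 2, ++addr) {
        memory[addr] = read_be16(&image[i]);
    }
    return origin;
}
```

`swap16` mirrors the approach described in the note and is only correct on a little-endian host. `read_be16` avoids that assumption by assembling the word from individual bytes.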
The VM uses a few terminal-settings functions to control how the terminal behaves while a program runs. They let the VM deliver keyboard input to the LC-3 program as soon as a key is pressed and leave the terminal in a usable state afterwards. The key functions and their purposes are:
disable_input_buffering():
Disables the terminal’s input buffering and echoing.
This ensures that input characters are immediately available to the VM without waiting for the Enter key, which is crucial for real-time input handling.
restore_input_buffering():
Restores the terminal’s input buffering and echoing to their default settings.
This ensures that the terminal behaves normally after the VM has finished executing.
handle_interrupt(int signal):
Handles interrupt signals such as the one sent by Ctrl+C (reported as "Received signal: 2" in the sample run above).
This ensures that the VM can clean up resources and restore terminal settings before exiting, providing a better user experience and preventing terminal misbehavior.
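On a POSIX system, these three functions could be implemented with `termios` roughly as follows. This is a sketch under assumptions rather than the project's actual code: it targets POSIX terminals only, the globals `original_tio` and `tio_saved` are hypothetical, and the exit status is an arbitrary choice.

```cpp
#include <csignal>
#include <cstdio>
#include <cstdlib>
#include <termios.h>
#include <unistd.h>

static termios original_tio;    // settings captured before the VM changes them
static bool tio_saved = false;  // false when stdin is not a terminal

void disable_input_buffering() {
    if (tcgetattr(STDIN_FILENO, &original_tio) != 0) return;  // stdin is not a terminal
    tio_saved = true;
    termios new_tio = original_tio;
    // Turn off canonical (line-buffered) mode and echoing of typed characters.
    new_tio.c_lflag &= ~static_cast<tcflag_t>(ICANON | ECHO);
    tcsetattr(STDIN_FILENO, TCSANOW, &new_tio);
}

void restore_input_buffering() {
    if (tio_saved) tcsetattr(STDIN_FILENO, TCSANOW, &original_tio);
}

void handle_interrupt(int signal) {
    restore_input_buffering();
    // printf and exit are not async-signal-safe; this is tolerable for a small
    // console program but worth revisiting in production code.
    std::printf("Received signal: %d\n", signal);
    std::exit(EXIT_FAILURE);  // assumed exit status
}

// Typical wiring at VM start-up:
//   std::signal(SIGINT, handle_interrupt);
//   disable_input_buffering();
//   ... run the VM ...
//   restore_input_buffering();
```

Saving the original settings once and restoring them from both the normal exit path and the signal handler keeps the terminal usable even when the user stops the VM with Ctrl+C.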