Overview
Implemented a compact toolchain that compiles a small C‑like language into a memory‑oriented assembly, translates that into register‑based assembly via a Python cross‑compiler, and runs the resulting program on a 5‑stage pipelined processor simulator (Fetch, Decode, Execute, Memory, Write‑Back).
Key Challenges
The main challenge was balancing a simple yet expressive source language against efficient intermediate code, then mapping memory‑oriented instructions onto a register machine while keeping the simulator correct under pipeline hazards.
- Designing parsing rules and AST with LEX/YACC for a C‑like syntax
- Producing deterministic memory‑oriented assembly and optimizing temporaries
- Cross‑compiling memory ops into register sequences with load/store insertion
- Simulating pipeline hazards and inserting NOPs where needed
Implementation
Compiler front‑end: LEX tokenizes the source and YACC parses it into an AST; an intermediate code generator then emits one instruction per line in the form OP @A @B @C (32‑bit encoding). Optimizations include constant folding and a reduction in the number of temporary variables.
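Constant folding can be illustrated with a minimal Python sketch. The tuple-based AST used here (`("num", n)`, `("var", name)`, `(op, left, right)`) and the `fold` function are hypothetical stand-ins, not the node types produced by the YACC grammar, and the sketch ignores any wraparound to the target's word size.

```python
import operator

# Operators the sketch knows how to fold (an assumed subset of the language).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(node):
    """Recursively replace constant subexpressions with their value."""
    kind = node[0]
    if kind in ("num", "var"):
        return node  # leaves are already as simple as they can be
    left, right = fold(node[1]), fold(node[2])
    if left[0] == "num" and right[0] == "num":
        # Both operands are known at compile time: compute the result now.
        return ("num", OPS[kind](left[1], right[1]))
    return (kind, left, right)

# x + 2 * 3 becomes x + 6, so one temporary and one MUL disappear.
folded = fold(("+", ("var", "x"), ("*", ("num", 2), ("num", 3))))
```

Folding before code generation is what reduces temporaries: every folded subtree is one less intermediate result that needs its own memory slot.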
Assembler & interpreter: the memory‑oriented instructions are executed by an interpreter for quick validation.
Cross‑compiler: a Python script that replaces memory operands with register sequences (LDR/STR) and applies a simple register allocation policy.
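The memory‑to‑register translation might look roughly like the sketch below. It assumes a fixed policy that is not taken from the project: the result always goes through R0, sources are loaded into R1 and R2, and nothing is kept live between instructions. The function name `registerize` and the operand order of LDR/STR are also assumptions, and immediates are not handled.

```python
def registerize(mem_asm):
    """Translate memory-oriented lines into register-based lines.

    Assumed policy: result in R0, sources in R1/R2, and every
    instruction reloads its inputs from memory.
    """
    out = []
    for line in mem_asm:
        op, dest, *sources = line.split()
        regs = []
        for i, src in enumerate(sources, start=1):
            out.append(f"LDR R{i} {src}")        # load each memory operand
            regs.append(f"R{i}")
        out.append(" ".join([op, "R0", *regs]))  # compute on registers only
        out.append(f"STR {dest} R0")             # write the result back
    return out

reg_asm = registerize(["ADD @4 @2 @3"])
# ['LDR R1 @2', 'LDR R2 @3', 'ADD R0 R1 R2', 'STR @4 R0']
```

A policy this naive is predictable but emits redundant loads and stores, which is the kind of overhead a better register allocator would remove.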
Processor simulator: modular 5‑stage pipeline, 16×8‑bit register bank, ROM for instructions, RAM for data, ALU supporting arithmetic/logical ops and flags (Z,N,O,C). A hazard manager detects RAW hazards and injects NOPs to preserve correctness.
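As an illustration of how an 8‑bit ALU can derive the four flags, here is a small Python model of addition. The flag semantics follow the usual conventions (zero, negative, signed overflow, carry out) and are an assumption; the simulator's own definitions may differ, and `alu_add` is a hypothetical name.

```python
def alu_add(a, b):
    """Add two 8-bit values; return (result, flags) for Z, N, O, C."""
    a &= 0xFF
    b &= 0xFF
    full = a + b             # up to 9 significant bits
    result = full & 0xFF     # truncate to the 8-bit register width
    flags = {
        "Z": result == 0,                # result is zero
        "N": bool(result & 0x80),        # sign bit of the result is set
        "C": full > 0xFF,                # carry out of bit 7
        # Signed overflow: operands share a sign that the result does not.
        "O": bool(~(a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags

# 127 + 1 wraps to 0x80: negative and overflowed when read as signed.
result, flags = alu_add(0x7F, 0x01)
```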
Features
Full toolchain
Source → memory asm → cross‑compiler → register asm → pipeline simulator.
Optimizations
Constant folding and temporary reduction to produce leaner intermediate code.
Hazard management
Data hazard detection and NOP insertion to keep the pipeline correct.
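RAW detection with NOP insertion can be sketched as a pass over the instruction stream. This is a software model under stated assumptions, not the simulator's hazard manager: instructions are hypothetical `(op, dest_reg, source_regs)` tuples, there is no forwarding, and `gap=3` is an assumed number of slots needed between a write and a dependent read.

```python
NOP = ("NOP", None, ())

def insert_nops(program, gap=3):
    """Pad read-after-write hazards with NOPs.

    program: list of (op, dest_reg, source_regs) tuples.
    gap: assumed number of instruction slots required between a
         register write and a dependent read (design dependent).
    """
    out = []
    for instr in program:
        sources = instr[2]
        # Scan the last `gap` issued slots, nearest first, for a writer
        # of any register this instruction reads.
        for distance, prev in enumerate(reversed(out[-gap:]), start=1):
            if prev[1] is not None and prev[1] in sources:
                out.extend([NOP] * (gap - distance + 1))
                break
        out.append(instr)
    return out

program = [("LDR", "R1", ()), ("ADD", "R0", ("R1", "R2"))]
padded = insert_nops(program)  # LDR, three NOPs, then ADD
```

Checking the nearest writer first is sufficient because it is the one that requires the most padding; once it is satisfied, any older writer in the window is satisfied as well.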
Results
Validated the full chain on representative programs: behavior was verified on both the interpreter and the simulator, and the registerized code ran on the pipeline with hazard handling active. The cross‑compiler produced predictable register sequences, so programs executed on the pipeline without semantic regressions.
Lessons & next steps
- Designing small languages and code generation with LEX/YACC.
- Practical cross‑compilation strategies for memory → register translation.
- Future work: register allocation improvements, branch prediction emulation, and cycle‑accurate performance metrics.