Research Interests
- VLSI design
- Hardware Accelerators
- Computer Architecture
- In-Memory Computing
Research Experience
Research Assistant, RPTU, Germany
- Contributed to the design of a robust, efficient, low-power pipelined FFT architecture.
- Prepared lecture slides for an academic course.
- Led the LLM-guided ASIC design of a Space Invaders game. The project involved Verilog RTL development, automated RTL-to-GDSII generation, and a successful tapeout through TinyTapeout.
Projects
CrocSoC
Full ASIC implementation flow for the CrocSoC design, including functional verification, logic synthesis, floorplanning, placement and routing, and clock tree synthesis using open-source tools.
CNN
Designed and implemented a pipelined CNN accelerator supporting a single-stage convolution layer, a max-pooling layer, and a fully connected layer.

CGRA - Coarse-Grained Reconfigurable Array
This project explores the design of a Coarse-Grained Reconfigurable Array (CGRA) based on the open-source OpenEdgeCGRA project, focusing on an efficient processing element architecture and reconfigurable interconnects for accelerated computing.

LLM for Chip Design - Space Invaders Game
The Space Invaders project shows how generative AI can enhance productivity in chip design. We developed a methodology in which prompt descriptions are refined over multiple iterations to generate the Verilog code for the game. The design was taped out through the TinyTapeout platform; it occupies an area of 0.064 mm² and operates at 25 MHz.

RV32I - RISC-V Processor
This project involves the design and implementation of a processor for the RV32I RISC-V instruction set architecture, with a focus on optimized pipeline stages and functional correctness. The CPU includes pipeline hazard detection and resolution.

Data Cache Controller
Implemented a direct-mapped, write-through cache and controller in Verilog. Designed for a single-cycle processor system, the cache interfaces with a main memory of n-cycle latency to speed up data access.
Specifications & Features:
- The design comprises three main submodules: a) Cache controller with tag array, b) Cache data array, and c) Main memory interface.
- Utilizes a direct-mapped architecture with a write-through, no-write-allocate policy.
- Features a 16-byte (128-bit) cache block size.
- Uses 4 cache lines (the line count is parameterized).
- Supports 32-bit processor data and address buses.
- Integrated with an interleaved memory system consisting of four 32-bit memory banks.
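
The specifications above imply a fixed split of the 32-bit address: 4 offset bits for the 16-byte block, 2 index bits for the 4 lines, and 26 tag bits. The following is a minimal Python behavioral sketch of that split and of the write-through, no-write-allocate policy. It is not the Verilog implementation: the names `split_address` and `DirectMappedCache` are hypothetical, main memory is modeled as a plain dictionary that reads as zero where unwritten, and the memory latency and bank interleaving are not modeled.

```python
BLOCK_BYTES = 16   # from the spec: 16-byte (128-bit) cache block
NUM_LINES = 4      # from the spec: 4 cache lines (parameterized)
ADDR_BITS = 32     # from the spec: 32-bit address bus
WORDS_PER_BLOCK = BLOCK_BYTES // 4  # four 32-bit words per block

OFFSET_BITS = BLOCK_BYTES.bit_length() - 1       # log2(16) = 4
INDEX_BITS = NUM_LINES.bit_length() - 1          # log2(4)  = 2
TAG_BITS = ADDR_BITS - INDEX_BITS - OFFSET_BITS  # 32 - 2 - 4 = 26


def split_address(addr):
    """Split a byte address into (tag, index, offset)."""
    offset = addr & (BLOCK_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class DirectMappedCache:
    """Hypothetical model: direct-mapped, write-through, no-write-allocate."""

    def __init__(self):
        self.valid = [False] * NUM_LINES
        self.tags = [0] * NUM_LINES
        self.data = [[0] * WORDS_PER_BLOCK for _ in range(NUM_LINES)]
        # Assumption: memory is a dict of word-aligned address -> 32-bit word.
        self.memory = {}

    def _hit(self, tag, index):
        return self.valid[index] and self.tags[index] == tag

    def read(self, addr):
        """Return (word, hit). A read miss fetches the whole block."""
        tag, index, offset = split_address(addr)
        hit = self._hit(tag, index)
        if not hit:
            base = addr & ~(BLOCK_BYTES - 1)
            self.data[index] = [
                self.memory.get(base + 4 * w, 0) for w in range(WORDS_PER_BLOCK)
            ]
            self.tags[index] = tag
            self.valid[index] = True
        return self.data[index][offset >> 2], hit

    def write(self, addr, value):
        """Return hit. Memory is always updated; the cache only on a hit."""
        tag, index, offset = split_address(addr)
        hit = self._hit(tag, index)
        value &= 0xFFFFFFFF
        if hit:
            self.data[index][offset >> 2] = value
        self.memory[addr & ~3] = value  # write-through
        return hit
```

In this sketch a write miss leaves the cache untouched (no-write-allocate), so a line is filled only by a later read miss. Two addresses that differ by a multiple of 64 bytes share an index and evict each other.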

Learning Resources
Education
- Joint M.Sc. in Embedded Computing Systems, University of Southampton.
- Joint M.Sc. in Embedded Computing Systems, RPTU.
- B.Sc. in Electrical and Electronics Engineering, EIT.