64-bit 4-wide Out-of-Order Processor core

  • RV64IMAFDCSU Instruction set
  • 9-stage pipeline (Fetch, Predecode, Checking, Decode, Rename, Dispatch, Issue, Execute and Retire)
  • Decodes and issues up to 4 instructions per cycle, with out-of-order execution and in-order commit using a reorder buffer (ROB)
  • Six execution units with split functionality
  • Branch prediction using BTB and 2-bit direction predictor
  • Register renaming to eliminate false (WAR and WAW) dependencies
  • 16 KiB 4-Way set-associative VIPT I-Cache & PIPT D-Cache
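
The 2-bit direction predictor listed above can be illustrated with a minimal sketch of a table of saturating counters. The table size, the PC indexing, and the initial counter state are assumptions made for illustration; the feature list does not specify them, and the class name `TwoBitPredictor` is hypothetical.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by low PC bits (illustrative)."""

    def __init__(self, entries=1024):
        # Assumed table size; must be a power of two so the PC can be masked.
        assert entries > 0 and entries & (entries - 1) == 0
        self.mask = entries - 1
        # Counter states 0 and 1 predict not-taken, 2 and 3 predict taken.
        # Starting in state 1 (weakly not-taken) is an assumption.
        self.table = [1] * entries

    def _index(self, pc):
        # Drop bit 0: with the C extension, instructions are 2-byte aligned.
        return (pc >> 1) & self.mask

    def predict(self, pc):
        """Return True if the branch at pc is predicted taken."""
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Move the counter one step toward the actual outcome, saturating."""
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def count_mispredictions(predictor, pc, outcomes):
    """Run a sequence of branch outcomes through the predictor."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses
```

For a loop branch that is taken nine times and then falls through, a warmed-up counter mispredicts only the loop exit, because one not-taken outcome moves the counter from strongly taken to weakly taken rather than flipping the prediction.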

A 32-bit, 5-stage Pipelined Scalar RISC-V Processor

  • Supports RV32G (RV32IMAFD) instructions
  • Integrated Floating-Point Unit
  • Split I and D (Instruction and Data) caches
  • Virtual Memory, Data and Instruction TLBs (DTLB & ITLB)
  • Branch Prediction Unit
  • System Counters
  • Interrupt controller with four levels of pre-emptive priority
  • Main Memory with Error Control Coding
  • Wishbone B.3 Bus Protocol
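
As a rough sketch of what four levels of pre-emptive priority means in practice, the function below picks which pending interrupt, if any, may interrupt the handler that is currently running. The level numbering (0 to 3, with 3 the most urgent) and the function name `select_interrupt` are assumptions for illustration, not details taken from the design.

```python
NUM_LEVELS = 4  # four priority levels, as in the feature list


def select_interrupt(pending, current_level=None):
    """Return the priority level to service next, or None.

    pending: iterable of pending priority levels (assumed 0..3, 3 = most urgent)
    current_level: level of the handler running now, or None if no handler runs
    """
    # Ignore anything outside the supported range of levels.
    candidates = [lvl for lvl in pending if 0 <= lvl < NUM_LEVELS]
    if not candidates:
        return None
    best = max(candidates)
    # Pre-emption: only a strictly higher level may interrupt a running handler.
    if current_level is None or best > current_level:
        return best
    return None
```

With a level-2 handler running, a pending level-3 request preempts it, while pending level-1 or level-2 requests wait until the handler returns.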

A 32-bit, 5-stage Dual-pipeline Superscalar RISC-V Processor

  • 32-bit, 5-stage dual-pipeline superscalar processor based on the RISC-V ISA
  • Integrated Floating-Point Unit
  • I and D (Instruction and Data) caches
  • Virtual Memory Support
  • Dynamic Branch Prediction Unit
  • Interrupt controller
  • Main Memory with Error Control Coding
  • UART
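
The feature lists do not say which error control code protects main memory. As a stand-in, the sketch below uses a textbook Hamming(7,4) single-error-correcting code to show the general idea: parity bits are stored alongside the data, and on a read the syndrome points at the bit that flipped. The function names are hypothetical, and a real memory would typically use a wider code over a full data word.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Returns a list of 8 ints; index 0 is unused so that list indices
    match the conventional Hamming bit positions 1..7.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    # Data bits occupy the positions that are not powers of two.
    code[3], code[5], code[6], code[7] = d
    # Each parity bit covers the positions whose index has that bit set.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code


def hamming74_decode(code):
    """Correct up to one flipped bit; return (nibble, syndrome).

    A syndrome of 0 means no error was detected; otherwise it is the
    position (1..7) of the bit that was corrected.
    """
    code = list(code)  # work on a copy
    syndrome = 0
    for p in (1, 2, 4):
        parity = 0
        for pos in range(1, 8):
            if pos & p:
                parity ^= code[pos]
        if parity:
            syndrome |= p
    if syndrome:
        code[syndrome] ^= 1  # flip the faulty bit back
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, syndrome
```

Flipping any single bit of an encoded word and decoding it recovers the original 4-bit value, with the syndrome equal to the position of the flipped bit.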

A Softcore RISC-V Vector Processor for Edge AI

RISC-V OoO Vector Processor for ML Inference

High-Level Synthesis of Geant4 Particle Transport Application for FPGA

Photon Transport System

  • Host: Intel Xeon Silver CPU @ 2.1 GHz
  • FPGA: Xilinx Alveo U250
  • Kernel frequency: 300 MHz
  • Processes Implemented: Compton Scattering, Transportation
  • Features: Parallel particle simulation, flexible host code for multiple compute units (CUs), burst reads/writes to DDR

Runtime Programmable Coprocessor for DCNN

  • Runtime Programmable
  • Computational throughput of more than 140 G operations/s
  • Max image size: 256 × 256, filter size: 16 × 16, Convolution stride: 1
  • Maximum of 32 layers per network and 256 filters per layer
  • Rectified linear unit (ReLU) nonlinearity and 2 × 2 max-pooling with stride 2
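
To make the limits above concrete, the following sketch computes the feature-map size after one convolution and pooling stage. It assumes an unpadded ("valid") convolution and floor rounding in the pooling step; neither is stated in the list, and the function names are hypothetical.

```python
def conv_output_size(in_size, kernel, stride=1):
    """Output width/height of an unpadded convolution (padding is assumed absent)."""
    return (in_size - kernel) // stride + 1


def pool_output_size(in_size, window=2, stride=2):
    """Output width/height of max-pooling, rounding down (assumed)."""
    return (in_size - window) // stride + 1


def layer_output_size(in_size, kernel):
    """Size after stride-1 convolution followed by 2 x 2, stride-2 max-pooling."""
    return pool_output_size(conv_output_size(in_size, kernel))
```

Under these assumptions, the largest supported case (a 256 × 256 image with a 16 × 16 filter) gives a 241 × 241 map after convolution and a 120 × 120 map after pooling.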