Browsing Theses by Supervisor "Kapre, Nachiket"

Fair and Efficient Resource Scheduling in Heterogeneous Multi-Agent Systems

Omidi, Mohammadhadi (University of Waterloo, 2024-02-22)

The performance of machine-learning applications heavily relies on the choice of the underlying hardware architecture, encompassing factors such as computational power, scalability, memory, and storage capabilities. These ...

Fairness Notions on Hardware Resource Configuration

Vellora Vayalapra, Aravind (University of Waterloo, 2023-10-26)

To meet the performance and energy-efficiency demands of modern workloads, specialized hardware accelerators implemented on FPGAs or ASICs have found adoption in modern servers and Systems-on-Chip (SoCs). These hardware accelerators ...

HopliteBuf FPGA Network-on-Chip: Architecture and Analysis

Garg, Tushar (University of Waterloo, 2019-04-22)

We can prove occupancy bounds of stall-free FIFOs used in deflection-free, low-cost, and high-speed FPGA overlay Networks-on-Chip (NoCs). In our work, we build on top of the HopliteRT livelock-free overlay NoC with an ...

Implementing FPGA-optimized Systolic Arrays using 2D Knapsack and Evolutionary Algorithms

Chan, Long Chan (University of Waterloo, 2022-01-25)

Underutilization of FPGA resources is a significant challenge in deploying FPGAs as neural network accelerators. We propose an FPGA-optimized systolic array architecture improving the CNN inference throughput by orders ...

Managing HBM’s bandwidth in Multi-Die FPGAs using Overlay NoCs

Kuttuva Prakash, Srinirdheeshwar (University of Waterloo, 2022-01-18)

We can improve HBM bandwidth distribution and utilization on a multi-die FPGA such as the Xilinx Alveo U280 by using overlay Networks-on-Chip (NoCs). HBM on the Xilinx Alveo U280 offers 8 GB of memory capacity with a theoretical ...

Mocarabe: High-Performance Time-Multiplexed Overlays for FPGAs

Mellat, Alireza (University of Waterloo, 2022-01-27)

Coarse-grained reconfigurable array (CGRA) overlays can improve dataflow kernel throughput by an order of magnitude over Vivado HLS on Xilinx Alveo U280. This is possible with a combination of carefully floorplanned ...

NengoFPGA: an FPGA Backend for the Nengo Neural Simulator

Morcos, Benjamin (University of Waterloo, 2019-08-22)

Low-power, high-speed neural networks are critical for providing deployable embedded AI applications at the edge. We describe a Xilinx FPGA implementation of Neural Engineering Framework (NEF) networks with online learning ...

Worst-Case Latency Analysis for the Versal Network-on-Chip

Elmor Lang, Ian (University of Waterloo, 2022-01-18)

The recent line of Versal FPGA devices from Xilinx Inc. includes a hard Network-On-Chip (NoC) embedded in the programmable logic, designed to be a high-performance system-level interconnect. While the target markets for ...