Design methodologies and architectures for application-specific coarse-grain reconfigurable accelerators
RATTO, FRANCESCO
2024-02-16
Abstract
Embedded computing systems are rapidly shifting from general-purpose single-core architectures towards heterogeneous multi-core platforms. The initial reasons behind this shift were the well-known end of Dennard scaling and of Moore's law. These forced the adoption of complex architectures, such as multi-core CPUs, GPUs, and FPGAs, to keep pace with the increasing computational power demanded by applications. On top of that, recent years have seen the rise of numerous challenges beyond traditional performance scalability. Flexibility, adaptivity, and sustainability are needed to cope with evolving applications while avoiding, or at least delaying, technology obsolescence. In this context, the reconfigurability exposed by FPGA-based platforms is becoming increasingly appealing to hardware and software developers. The adoption of heterogeneous reconfigurable architectures does not come for free, however. Porting applications to such complex platforms still requires advanced hardware and software knowledge to obtain the desired levels of performance and energy efficiency. High-level synthesis tools are a promising solution for bridging the gap between software developers, who are used to describing applications through programming languages and high-level APIs, and FPGA platforms, which require hardware specifications for their configuration. These tools are effective at translating imperative code into a hardware specification, but they poorly address non-functional requirements, which encompass power optimization, performance adaptivity, and execution flexibility. Leveraging the potential of dataflow models and coarse-grained reconfigurable accelerators is a valuable way to address these challenges. Dataflow modeling captures the parallelism, concurrency, and modularity of applications. It matches well with coarse-grained reconfigurable accelerators, which provide a flexible architectural solution for targeting FPGA substrates.
Coarse-grained reconfigurability can be modeled at higher abstraction levels than the fine-grained reconfigurability of FPGAs, hiding some of the complexity of the hardware platform and ensuring faster reconfiguration times. This dissertation focuses on several aspects of coarse-grained reconfigurable accelerators and their integration with high-level synthesis tools. It investigates the effectiveness of applying power-saving techniques to such architectures when they are developed with high-level synthesis. Adaptivity support is provided for coarse-grained reconfigurable accelerators designed to accelerate convolutional neural networks. Execution flexibility is delivered by enabling the combination of multi-thread and multi-task capabilities. For each result, different degrees of design automation are provided to ease the adoption of the developed solutions.

File | Size | Format |
---|---|---|
thesis.pdf (embargoed until 15/02/2025; description: Thesis; type: Doctoral thesis) | 5.7 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.