Dynamic Pruning for Parsimonious CNN Inference on Embedded Systems

Busia P.; Meloni P.
2022-01-01

Abstract

As a consequence of the current edge-processing trend, the deployment of Convolutional Neural Networks (CNNs) has spread to a rich landscape of devices, highlighting the need to reduce the algorithm's complexity and to exploit hardware-aided computing as two prospective ways to improve performance on resource-constrained embedded systems. In this work, we consider a compression method that reduces a CNN's computational workload according to the complexity of the data being processed, by pruning unnecessary connections at runtime. To evaluate its efficiency when applied to edge processing platforms, we consider a keyword spotting (KWS) task executing on SensorTile, a low-power microcontroller platform by ST, and an image recognition task running on NEURAghe, an FPGA-based inference accelerator. In the first case, we obtained a 51% average reduction of the computing workload, resulting in up to a 44% inference speedup and 15% energy savings, while in the latter a 36% speedup is achieved thanks to a 44% workload reduction. © 2022, Springer Nature Switzerland AG.
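
The abstract describes pruning unnecessary connections at runtime, depending on the complexity of the current input. As a rough illustration of that general idea only (not the specific method evaluated in the paper), the Python sketch below skips input channels whose activations fall under a magnitude threshold for the data at hand; the function name, the mean-magnitude criterion, and the threshold value are assumptions introduced here for demonstration.

    # Illustrative sketch of input-dependent (dynamic) pruning of a conv layer.
    # The channel-selection criterion and threshold are assumptions, not the
    # paper's method.
    import numpy as np

    def conv2d_dynamic_prune(x, weights, threshold=1e-2):
        """Naive 2D convolution that skips input channels whose activations
        are negligible for the current input.

        x:       input feature map, shape (C_in, H, W)
        weights: kernels, shape (C_out, C_in, K, K)
        Returns the output map and the fraction of channel work actually done.
        """
        c_out, c_in, k, _ = weights.shape
        h_out, w_out = x.shape[1] - k + 1, x.shape[2] - k + 1
        y = np.zeros((c_out, h_out, w_out))

        # Data-dependent decision: which input channels carry enough signal
        # to be worth computing for this particular input.
        active = [c for c in range(c_in) if np.abs(x[c]).mean() > threshold]

        for co in range(c_out):
            for c in active:  # pruned channels cost nothing at runtime
                for i in range(h_out):
                    for j in range(w_out):
                        y[co, i, j] += np.sum(x[c, i:i+k, j:j+k] * weights[co, c])
        return y, len(active) / c_in

On an embedded target the same decision would typically gate pre-compiled per-channel kernels rather than a Python loop, but the workload saving comes from the same place: connections judged unnecessary for the current input are simply not computed.
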
2022
978-3-031-12747-2
978-3-031-12748-9
Convolutional Neural Networks; Hardware acceleration; Pruning
Files in this record:
File: DASIP22_iris.pdf
Access: open access
Description: AAM
Type: post-print version (AAM)
Size: 480.53 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/345340
Citations
  • PMC: n/a
  • Scopus: 2
  • Web of Science (ISI): 1