Chipmunk: A Systolically Scalable 0.9 mm${}^2$, 3.08 Gop/s/mW @ 1.2 mW Accelerator for Near-Sensor Recurrent Neural Network Inference.

Francesco Conti, Lukas Cavigelli, Gianna Paulin, Igor Susmelj, Luca Benini

Recurrent neural networks (RNNs) are the state of the art in voice awareness/understanding and speech recognition. On-device computation of RNNs on low-power mobile and wearable devices would be key to applications such as zero-latency voice-based human-machine interfaces. Here we present Chipmunk, a small (<1 mm${}^2$) hardware accelerator for Long Short-Term Memory RNNs in UMC 65 nm technology, capable of operating at a measured peak efficiency of up to 3.08 Gop/s/mW at 1.24 mW peak power. To implement large RNN models without incurring huge memory transfer overhead, multiple Chipmunk engines can cooperate to form a single systolic array. In this way, the Chipmunk architecture in a 75-tile configuration can achieve real-time phoneme extraction on a demanding RNN topology proposed by Graves et al., consuming less than 13 mW of average power.
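
As a rough sanity check on the headline figures, the throughput and per-tile power they imply can be worked out directly. This is a back-of-the-envelope sketch, not a result reported in the abstract: it assumes that the 3.08 Gop/s/mW efficiency and the 1.24 mW power refer to the same operating point (as the "@ 1.2 mW" in the title suggests), and that the 13 mW average power bound is spread evenly over the 75 tiles. The symbols $T$ and $P_{\text{tile}}$ are introduced here for illustration only.

$$
T \approx 3.08\ \tfrac{\text{Gop/s}}{\text{mW}} \times 1.24\ \text{mW} \approx 3.82\ \text{Gop/s}
$$

$$
P_{\text{tile}} < \frac{13\ \text{mW}}{75} \approx 0.17\ \text{mW}
$$

Under these assumptions, the average per-tile power in the 75-tile configuration is well below the 1.24 mW single-engine peak.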
