GCD2: A Globally Optimizing Compiler for Mapping DNNs to Mobile DSPs

Wei Niu, Jiexiong Guan, Xipeng Shen, Yanzhi Wang, Gagan Agrawal, Bin Ren

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

More specialized chips are exploiting available high transistor density to expose parallelism at a large scale with more intricate instruction sets. This paper reports on a compilation system, GCD2, developed to support complex Deep Neural Network (DNN) workloads on mobile DSP chips. We observe several challenges in fully exploiting this architecture, related to SIMD width, more complex SIMD/vector instructions, and a VLIW pipeline with the notion of soft dependencies. GCD2 comprises the following contributions: 1) development of matrix layout formats that support the use of different novel SIMD instructions, 2) formulation and solution of a global optimization problem related to choosing the best instruction (and associated layout) for implementation of each operator in a complete DNN, and 3) SDA, an algorithm for packing instructions with consideration for soft dependencies. These solutions are incorporated in a complete compilation system that is extensively evaluated against other systems using 10 large DNN models. Evaluation results show that GCD2 outperforms two product-level state-of-the-art end-to-end DNN execution frameworks that support mobile DSPs (TFLite and Qualcomm SNPE) with speedups of up to 6.0×, and outperforms three established compilers (Halide, TVM, and RAKE) with speedups of up to 4.5×, 3.4×, and 4.0×, respectively. GCD2 is also unique in supporting real-time execution of certain DNNs, while its implementation enables two major DNNs to execute on a mobile DSP for the first time.

Original language: English (US)
Title of host publication: Proceedings - 2022 55th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2022
Publisher: IEEE Computer Society
Pages: 512-529
Number of pages: 18
ISBN (Electronic): 9781665462723
State: Published - 2022
Event: 55th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2022 - Chicago, United States
Duration: Oct 1 2022 → Oct 5 2022

Publication series

Name: Proceedings of the Annual International Symposium on Microarchitecture, MICRO
Volume: 2022-October
ISSN (Print): 1072-4451

Conference

Conference: 55th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2022
Country/Territory: United States
City: Chicago
Period: 10/1/22 → 10/5/22

Keywords

  • compiler optimization
  • deep neural network
  • mobile devices
  • VLIW instruction packing

ASJC Scopus subject areas

  • Hardware and Architecture
