DSIP: A Scalable Inference Accelerator for Convolutional Neural Networks

Cited 43 time in webofscience Cited 0 time in scopus
  • Hit : 837
  • Download : 0
This paper presents a scalable inference accelerator called a deep-learning specific instruction-set processor (DSIP) to support various convolutional neural networks (CNNs). For CNNs requiring a large amount of computations and memory accesses, a programmable inference system called master-slave instruction set architecture (ISA) is newly proposed to achieve high flexibility, processing speed, and energy efficiency. The master is responsible for sending and receiving feature maps in order to deal with neural networks in a scalable way, and the slave performs CNN operations, such as multiply accumulate, max pooling, and activation functions, on the features received from the master. The master-slave ISA maximizes computation speed by overlapping the off-chip data transmission and the CNN operations, and reduces power consumption by performing the convolution incrementally to reuse input and partial-sum data as maximally as possible. An inference system can be configured by connecting multiple DSIPs in a form of either 1-D or 2-D chain structure in order to enhance computation speed further. To evaluate the proposed accelerator, a prototype chip is implemented and evaluated for AlexNet. Compared to the state-of-the-art accelerator, the DSIP-based system enhances the energy efficiency by 2.17x.
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Issue Date
2018-02
Language
English
Article Type
Article
Keywords

CHIP

Citation

IEEE JOURNAL OF SOLID-STATE CIRCUITS, v.53, no.2, pp.605 - 618

ISSN
0018-9200
DOI
10.1109/JSSC.2017.2764045
URI
http://hdl.handle.net/10203/240388
Appears in Collection
EE-Journal Papers(저널논문)
Files in This Item
There are no files associated with this item.
This item is cited by other documents in WoS
⊙ Detail Information in WoSⓡ Click to see webofscience_button
⊙ Cited 43 items in WoS Click to see citing articles in records_button

qr_code

  • mendeley

    citeulike


rss_1.0 rss_2.0 atom_1.0