Hsinyu Tsai
IBM Almaden Research Center
Analog non-volatile memory (NVM)-based accelerators for Deep Neural Networks (DNNs) can achieve high throughput and energy efficiency by computing multiply-accumulate (MAC) operations using Ohm's law and Kirchhoff's current law on arrays of resistive memory devices [1]. In recent years, energy-efficient, weight-stationary MAC operations in analog NVM memory-array "Tiles" were demonstrated in hardware with Phase Change Memory (PCM) devices integrated in the backend of 14-nm CMOS [2, 3]. Competitive end-to-end DNN accuracies can be obtained with the help of hardware-aware training, accurate weight programming, and sufficiently linear MAC operations in the analog domain [4].

In this paper, I describe architectural and circuit advances for such analog NVM-based accelerators and their specialized digital compute units, designed to accelerate Transformer, Long Short-Term Memory (LSTM), and Convolutional Neural Networks (CNNs). I present a highly heterogeneous and programmable accelerator architecture that exploits a dense and efficient circuit-switched 2D mesh to exchange vectors of neuron activations over short distances in a massively parallel fashion [5]. Based on a 14-nm inference chip consisting of multiple arrays of PCM devices, the impact of memory materials on the accuracy and performance of these systems will be discussed.

The author would like to thank colleagues at IBM Research Almaden, Yorktown, Albany, Zurich and Tokyo for their contributions to this work, and the IBM Research AI HW Center.

[1] G. W. Burr et al. "Ohm's Law + Kirchhoff's Current Law = Better AI: Neural-Network Processing Done in Memory with Analog Circuits will Save Energy". In: IEEE Spectrum 58.12 (2021), pp. 44–49.
[2] P. Narayanan et al. "Fully on-chip MAC at 14nm enabled by accurate row-wise programming of PCM-based weights and parallel vector-transport in duration format". In: Symposium on VLSI Technology. 2021.
[3] M. Le Gallo et al. "A 64-core mixed-signal in-memory compute chip based on phase-change memory for deep neural network inference". In: arXiv:2212.02872 (2022).
[4] M. J. Rasch et al. "Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators". In: arXiv:2302.08469 (2023).
[5] S. Jain et al. "A Heterogeneous and Programmable Compute-In-Memory Accelerator Architecture for Analog-AI Using Dense 2-D Mesh". In: IEEE Trans. VLSI 31.1 (2023), pp. 114–127.
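The in-memory MAC principle from [1] can be sketched numerically: weights are stored as device conductances, Ohm's law produces a per-device current proportional to conductance times applied voltage, and Kirchhoff's current law sums those currents along a wire. The array sizes, conductance range, and voltage range below are illustrative assumptions, not parameters of the chip described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical values: a 4x8 tile of conductances G (siemens) holding weights,
# and read voltages V encoding the input activations.
G = rng.uniform(1e-6, 1e-5, size=(4, 8))  # weight-stationary conductance array
V = rng.uniform(0.0, 0.2, size=8)         # input activation voltages

# Analog MAC: each output current is the Kirchhoff sum of Ohm's-law
# currents G[i, j] * V[j] along one wire -- a matrix-vector product.
I = G @ V

# The same result written out element-wise, making the physics explicit.
I_explicit = np.array([sum(G[i, j] * V[j] for j in range(8)) for i in range(4)])
assert np.allclose(I, I_explicit)
```

Because every device contributes its current simultaneously, the whole matrix-vector product completes in one analog read step rather than O(rows × cols) digital multiply-accumulates.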