Characterization of Memory Devices for Energy Efficient Analog In-Memory Neural Computing at the Edge

When and Where

May 10, 2022
8:30am - 9:00am

Hawai'i Convention Center, Level 3, 318A

Presenter

Matthew Marinella

Tianyao Xiao

Christopher Bennett

William Wahby

Robin Jacobs-Gedrim

David Hughart

Elliot Fuller

A. Talin

Sapan Agarwal

Co-Author(s)

Matthew Marinella¹,², Tianyao Xiao¹, Christopher Bennett¹, William Wahby¹, Robin Jacobs-Gedrim¹, David Hughart¹, Elliot Fuller¹, A. Talin¹, Sapan Agarwal¹

Sandia National Laboratories¹, Arizona State University²

Abstract

Deep neural networks are well suited for edge computing tasks such as real-time sensor processing and data fusion. However, they are computationally expensive and hard to implement in the size-, weight-, and power-constrained (SWaP) environments encountered by most edge applications. Analog in-memory computing can potentially solve this challenge and process deep neural network inference with millisecond latencies and milliwatt power draw, owing to an energy-efficiency improvement of two or more orders of magnitude over state-of-the-art digital systems. These efficiency gains come from holding neural network weights stationary in arrays of nonvolatile memory and applying formatted analog input data from sensors directly to those weights. Each memory device's physical state (often its conductance) represents a range of numerical values, with precision limited by the memory device physics and peripheral circuitry. Unlike in traditional CMOS processors, the accuracy of the neural network depends directly on the collective precision of these physical weights, creating new memory device requirements that differ significantly from those of traditional binary and multi-level cell memory schemes. To assess the neural network accuracy achievable with candidate devices, we have created an electrical characterization and modeling framework for analog in-memory computing. Determining deep neural network inference accuracy requires devices to be programmed to specific resistances with up to 128 levels, but significantly more overlap in spread between levels is allowed than with a standard multi-level memory scheme. Accuracy is also strongly affected by how the precision varies across the 128 conductance levels, which directly interacts with the algorithm's distribution of weight values. The resulting analog characterization dataset can be used to statistically model the accuracy of an in-memory analog accelerator.
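As a rough illustration of the idea above (weights programmed to one of up to 128 levels, with limited precision), the following minimal Python sketch quantizes a weight matrix to 128 evenly spaced levels, adds Gaussian programming error, and measures the effect on a matrix-vector product. It is not the authors' framework: the function names, the uniform level spacing, the level-independent Gaussian error model, and the `sigma_frac` parameter are all assumptions made for illustration. A measured device dataset would replace the error model, in particular to capture how precision varies across levels.

```python
import math
import random

def program_weights(weights, n_levels=128, sigma_frac=0.0, rng=None):
    """Map weights to n_levels evenly spaced states and add programming error.

    Hypothetical model: sigma_frac is the standard deviation of the
    programming error, expressed as a fraction of the full weight range
    and assumed identical at every level.
    """
    rng = rng or random.Random()
    w_max = max(abs(w) for row in weights for w in row)
    if w_max == 0:
        return [list(row) for row in weights]
    full_range = 2.0 * w_max
    step = full_range / (n_levels - 1)
    programmed = []
    for row in weights:
        new_row = []
        for w in row:
            level = round((w + w_max) / step)   # level index in 0..n_levels-1
            target = level * step - w_max       # ideal value of that level
            new_row.append(target + rng.gauss(0.0, sigma_frac * full_range))
        programmed.append(new_row)
    return programmed

def mvm(weights, x):
    """Matrix-vector product, the operation an analog array performs."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def relative_error(approx, exact):
    """L2 norm of the difference, relative to the L2 norm of the exact result."""
    num = math.sqrt(sum((a - e) ** 2 for a, e in zip(approx, exact)))
    den = math.sqrt(sum(e ** 2 for e in exact))
    return num / den

rng = random.Random(0)
W = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(16)]
x = [rng.gauss(0.0, 1.0) for _ in range(64)]
ideal = mvm(W, x)

errors = {}
for sigma in (0.0, 0.01, 0.05):
    W_prog = program_weights(W, n_levels=128, sigma_frac=sigma, rng=rng)
    errors[sigma] = relative_error(mvm(W_prog, x), ideal)
    print(f"sigma_frac={sigma:.2f}  relative output error={errors[sigma]:.4f}")
```

With `sigma_frac=0.0` only the quantization error remains, so each programmed weight lies within half a level spacing of its target. Larger values stand in for device-to-device and cycle-to-cycle spread.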
We have characterized and modeled the potential accuracy of a commercial 40 nm semiconductor-oxide-nitride-oxide-semiconductor (SONOS) technology, Sandia's prototypical tantalum oxide (TaOx) resistive memory (ReRAM), and the emerging battery-inspired electrochemical memory (ECRAM). In the near term, embedded SONOS is an attractive option capable of high accuracy even on full-scale deep neural network benchmarks such as ImageNet. In the longer term, TaOx ReRAM promises a lower switching voltage, excellent scalability, nanosecond-scale switching, and high radiation resilience. However, the stochastic behavior associated with filamentary switching has made it difficult to achieve high accuracy on highly complex datasets such as ImageNet. Novel three-terminal ECRAM devices combine low-voltage, low-energy programming and excellent analog tunability, owing to their bulk switching mechanism. Furthermore, fab-compatible metal-oxide-based ECRAM devices use materials similar to those in ReRAM. In addition to precise initial weight values, edge computing typically requires stable retention of weight values over time and across temperature ranges. We have explored the impact of weight drift on neural network accuracy, which has allowed us to understand how often refresh cycles might be needed. Finally, we have explored the effect of ionizing radiation, which can be encountered in edge computing applications in space. The eventual degradation of neural network accuracy has been quantified as a function of total ionizing dose, and the effect of physical weight mapping on this degradation will be discussed. SNL is managed and operated by NTESS under DOE NNSA contract DE-NA0003525.