Xingsheng Wang (王兴晟)

Personal Information

Professor   Doctoral Supervisor   Master's Supervisor

Gender: Male

Employment Status: Active

Affiliation: School of Integrated Circuits

Education: Postgraduate (Doctoral) Graduate

Degree: Doctor of Philosophy (PhD)

Alma Mater: University of Glasgow

Discipline: Microelectronics and Solid-State Electronics

Publications


Optimizing hardware-software co-design based on non-ideality in memristor crossbars for in-memory computing

Paper Type: Journal Article

First Author: Pinfeng Jiang

Corresponding Author: Xingsheng Wang

Co-authors: Danzhe Song, Menghua Huang, Fan Yang, Letian Wang, Pan Liu, Xiangshui Miao

Published In: SCIENCE CHINA-Information Sciences

Indexed By: SCI

Affiliation: Huazhong University of Science and Technology

Place of Publication: China

Keywords: memristor crossbar, IR-drop, neural network, activation function, hardware-software co-design

Date of Publication: 2024-11-27

Abstract: The memristor crossbar, with its exceptionally high storage density and parallelism, enables efficient Vector-Matrix Multiplication (VMM), significantly improving data throughput and computational efficiency. However, its analog computing is vulnerable to issues such as IR-drop, device-to-device (D2D) variation, and Stuck-At-Fault (SAF), leading to a substantial decrease in the inference accuracy of neural networks deployed on crossbars. This work presents a hardware-software co-design approach tailored to deal with memristor crossbar non-ideality. We introduce an end-to-end Functional Array SimulaTor (FAST) for precise and ultra-fast end-to-end training, mapping, and evaluation of neural networks on the memristor crossbar. Utilizing the sparsity of the memristor crossbar coefficient matrix, it achieves simulation with low storage and computational resource requirements, dynamically selecting the optimal solution to complete the process. It can also precisely simulate the impact of non-ideal effects such as IR-drop, retention, variation, SAF, and AD/DA precision. Using FAST, we assess memristor crossbar matrix operations under non-ideal conditions, identifying the maximum throughput and the most energy-efficient crossbar configurations. Additionally, we propose a Comparator-based Activation Function Modulation (CAFM) scheme and its corresponding hardware architecture with programmable activation function circuits to address the IR-drop issue, enabling low power and area overheads and recovering neural network accuracy by 54% or more. This is validated within FAST, demonstrating the success of our hardware-software optimization co-design.
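The non-ideal effects discussed in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's FAST simulator; the matrix size, D2D variation level, and fault rate below are illustrative assumptions. It models a crossbar as a conductance matrix, computes the ideal VMM, then perturbs the conductances with lognormal D2D variation and a few stuck-at-fault cells to show how the column currents drift from the ideal result:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ideal weights mapped to device conductances (arbitrary units):
# 8 input rows x 4 output columns.
W = rng.uniform(0.1, 1.0, size=(8, 4))
x = rng.uniform(0.0, 1.0, size=8)  # input voltage vector

# Ideal vector-matrix multiplication: column currents I = x @ G.
ideal = x @ W

# Non-ideal crossbar: device-to-device (D2D) lognormal variation plus
# stuck-at-fault (SAF) devices pinned to extreme low/high conductances.
G = W * rng.lognormal(mean=0.0, sigma=0.1, size=W.shape)   # ~10% D2D spread
saf_mask = rng.random(W.shape) < 0.05                      # ~5% faulty cells
G[saf_mask] = rng.choice([0.01, 1.0], size=saf_mask.sum()) # stuck low/high
noisy = x @ G

# Relative output error introduced by the non-idealities.
rel_err = np.linalg.norm(noisy - ideal) / np.linalg.norm(ideal)
print(f"relative VMM error: {rel_err:.3f}")
```

A full simulator such as FAST additionally models IR-drop along the wires (which makes the effective voltage seen by each cell position-dependent), retention, and AD/DA quantization; this sketch only captures the static conductance-level errors.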