A quantized network processor for mixed-precision deep learning models based on enhanced Microscaling format

2026-02-25

Jing Feng, Mei Wen, Xin Ju, Zhuang Cao, Yang Guo,
A quantized network processor for mixed-precision deep learning models based on enhanced Microscaling format,
Journal of Systems Architecture,
Volume 168,
2025,
103585,
ISSN 1383-7621,
https://doi.org/10.1016/j.sysarc.2025.103585.
(https://www.sciencedirect.com/science/article/pii/S1383762125002577)
Abstract: With the rapid development of deep neural networks (DNNs), their computational and energy demands have grown rapidly, making quantization techniques crucial for efficient and sustainable deployment. The Microscaling (MX) format is a state-of-the-art quantization technique that uses shared micro-exponents to reduce storage space and improve computational efficiency. However, the MX format suffers from precision degradation at low mantissa widths and lacks efficient mixed-precision hardware support. This paper introduces a mixed-precision quantization processor with three key contributions: (1) a data format named Super Microscaling (SMX), which significantly improves accuracy, particularly under low-bit numerical representations; (2) efficient support for mixed-precision computation, enabling flexible adaptation to diverse numerical precision requirements; and (3) a specialized accelerator architecture designed to exploit the characteristics of the SMX format, maximizing its efficiency benefits. Experimental results demonstrate that SMX outperforms MX, improving accuracy by 55.86%, 40.89%, and 37.51% on average for BERT, ResNet50, and VGG models, respectively, while also reducing hardware area and power by 23.62% and 20.43%, demonstrating its overall superiority.
Keywords: Microscaling format; Mixed-precision; Quantization; Systolic array
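As background on the shared-exponent idea the abstract describes, the sketch below shows MX-style block quantization in NumPy: each block of values shares one power-of-two scale, and the elements are stored as low-bit signed integers. This is an illustrative approximation of the general Microscaling concept, not the paper's SMX format or its accelerator datapath; the function names, the 32-element block size, and the 4-bit element width are assumptions chosen for the example.

import numpy as np

def quantize_mx_block(block, elem_bits=4, block_size=32):
    # Sketch of MX-style block quantization (illustrative, not SMX):
    # one power-of-two scale is shared by the whole block, and each
    # element is rounded to a low-bit signed integer against it.
    assert block.size == block_size
    max_abs = np.max(np.abs(block))
    qmax = 2 ** (elem_bits - 1) - 1  # e.g. 7 for 4-bit signed elements
    # Smallest power-of-two scale whose representable range covers the block.
    shared_exp = 0 if max_abs == 0 else int(np.ceil(np.log2(max_abs / qmax)))
    scale = 2.0 ** shared_exp
    q = np.clip(np.round(block / scale), -qmax, qmax).astype(np.int8)
    return shared_exp, q

def dequantize_mx_block(shared_exp, q):
    # Reconstruct approximate values from the shared scale and elements.
    return q.astype(np.float32) * (2.0 ** shared_exp)

# Usage: quantize one 32-element block and inspect the rounding error.
rng = np.random.default_rng(0)
x = rng.standard_normal(32).astype(np.float32)
e, q = quantize_mx_block(x)
x_hat = dequantize_mx_block(e, q)
print("shared exponent:", e, "max abs error:", np.max(np.abs(x - x_hat)))

For reference, the published OCP Microscaling specification uses an 8-bit shared scale (E8M0) over blocks of 32 elements, with low-bit floating-point or integer element types; per the abstract, SMX refines this scheme to recover accuracy at low mantissa widths while adding mixed-precision hardware support.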