EfficientQ: An efficient and accurate post-training neural network quantization method for medical image segmentation.
Zhang, Rongzhao; Chung, Albert C S.
Affiliation
  • Zhang R; Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China. Electronic address: rzhangbe@connect.ust.hk.
  • Chung ACS; Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China. Electronic address: achung@cse.ust.hk.
Med Image Anal ; 97: 103277, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39094461
ABSTRACT
Model quantization is a promising technique that simultaneously compresses and accelerates a deep neural network by limiting its computation bit-width, and it plays a crucial role in the fast-growing AI industry. Although model quantization can produce well-performing low-bit models, the quantization process itself can still be expensive, as it may involve a long fine-tuning stage on a large, well-annotated training set. To make the quantization process more efficient in terms of both time and data requirements, this paper proposes a fast and accurate post-training quantization method, namely EfficientQ. We develop this new method with a layer-wise optimization strategy and leverage the alternating direction method of multipliers (ADMM) algorithm to ensure fast convergence. Furthermore, a weight regularization scheme is incorporated to provide more guidance for the optimization of the discrete weights, and a self-adaptive attention mechanism is proposed to combat the class imbalance problem. Extensive comparison and ablation experiments are conducted on two publicly available medical image segmentation datasets, i.e., LiTS and BraTS2020, and the results demonstrate the superiority of the proposed method over various existing post-training quantization methods in terms of both accuracy and optimization speed. Remarkably, with EfficientQ, quantizing a practical 3D UNet requires less than 5 min on a single GPU and one data sample. The source code is available at https://github.com/rongzhao-zhang/EfficientQ.
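The abstract gives no implementation details, so the following is only a generic illustration of layer-wise post-training quantization solved with ADMM, not the EfficientQ algorithm itself (see the authors' repository for that). It assumes a single linear layer, a symmetric uniform quantizer with a fixed per-layer scale, and a plain output-reconstruction objective; the function names, the penalty `rho`, and the iteration count are hypothetical choices, and the weight regularization and self-adaptive attention described above are omitted.

```python
import numpy as np


def project_to_grid(v, scale, bits):
    """Round v to the nearest level of a symmetric uniform grid."""
    qmax = 2 ** (bits - 1) - 1
    return np.clip(np.round(v / scale), -qmax, qmax) * scale


def layer_output_error(x, w_ref, w_q):
    """Squared Frobenius error between full-precision and quantized outputs."""
    return float(np.linalg.norm(x @ w_ref - x @ w_q) ** 2)


def admm_quantize_layer(x, w, bits=4, rho=1.0, iters=50):
    """Quantize one linear layer (illustrative sketch, assumes w is not all zeros).

    Solves  min 0.5 * ||x @ w - x @ w_c||^2  s.t.  w_c = g, g on the grid,
    by alternating a continuous update (w_c), a projection (g), and a
    scaled dual update (u). The best discrete iterate is returned because
    ADMM is not monotone on this non-convex problem.
    """
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax          # fixed per-layer step size (assumption)
    gram = x.T @ x
    lhs = gram + rho * np.eye(gram.shape[0])

    g = project_to_grid(w, scale, bits)     # start from round-to-nearest
    u = np.zeros_like(w)
    best_g, best_err = g, layer_output_error(x, w, g)

    for _ in range(iters):
        # Continuous step: closed-form solution of a ridge-like least squares.
        w_c = np.linalg.solve(lhs, gram @ w + rho * (g - u))
        # Discrete step: project onto the quantization grid.
        g = project_to_grid(w_c + u, scale, bits)
        # Dual step: accumulate the constraint residual.
        u = u + w_c - g
        err = layer_output_error(x, w, g)
        if err < best_err:
            best_g, best_err = g, err
    return best_g, scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(64, 16))           # calibration activations (synthetic)
    w = rng.normal(size=(16, 8))            # full-precision weights (synthetic)
    g, scale = admm_quantize_layer(x, w, bits=4)
    rtn = project_to_grid(w, scale, 4)
    print("round-to-nearest error:", layer_output_error(x, w, rtn))
    print("ADMM sketch error:     ", layer_output_error(x, w, g))
```

Because the loop starts from the round-to-nearest solution and keeps the best iterate, the returned weights never reconstruct the layer output worse than simple rounding on the calibration data; how much better they do depends on the data and on `rho`.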

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Algorithms / Image Processing, Computer-Assisted / Neural Networks, Computer | Limit: Humans | Language: En | Journal: Med Image Anal | Journal subject: Diagnostic Imaging | Year: 2024 | Document type: Article | Country of publication: Netherlands
