Maximum output discrepancy computation for convolutional neural network compression

Zihao Mo, Weiming Xiang

Research output: Contribution to journal › Article › peer-review

Abstract

Network compression methods reduce the number of network parameters and the computation cost while maintaining the desired network performance. However, the safety assurance of many compression methods rests on a large amount of experimental data, and unforeseen incidents beyond that data may lead to unsafe consequences. In this work, we develop a discrepancy computation method for two convolutional neural networks that gives a concrete value characterizing the maximum output difference between the original network and its compressed counterpart. Using ImageStar-based reachability analysis, we propose a novel method that merges the two networks to compute this difference. We illustrate the reachability computation for each layer in the merged network, including the convolution, max pooling, fully connected, and ReLU layers. We apply our method to a numerical example to demonstrate its correctness. Furthermore, we evaluate the developed method on the VGG16 model compressed with Quantization Aware Training (QAT); the results show that our approach can efficiently compute an accurate maximum output discrepancy between the original neural network and the compressed neural network.
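For intuition only, the following is a minimal sketch of the underlying idea of a sound output-discrepancy bound between an original network and a weight-quantized copy. It uses plain NumPy interval arithmetic over a box of inputs and toy fully connected networks; none of this is the article's ImageStar-based construction, and the function names, the quantization stand-in, and the network sizes are illustrative assumptions.

import numpy as np

def affine_interval(lo, hi, W, b):
    """Propagate an axis-aligned box [lo, hi] through x -> W @ x + b."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def relu_interval(lo, hi):
    """Elementwise ReLU on an interval box."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

def network_output_box(layers, lo, hi):
    """layers: list of (W, b) pairs; ReLU between layers, linear output layer."""
    for i, (W, b) in enumerate(layers):
        lo, hi = affine_interval(lo, hi, W, b)
        if i < len(layers) - 1:
            lo, hi = relu_interval(lo, hi)
    return lo, hi

rng = np.random.default_rng(0)
# Tiny "original" fully connected network: 4 -> 8 -> 3 (illustrative only).
layers = [(rng.standard_normal((8, 4)), rng.standard_normal(8)),
          (rng.standard_normal((3, 8)), rng.standard_normal(3))]
# "Compressed" copy: weights rounded to a coarse grid, a crude stand-in for QAT.
q = 0.25
layers_c = [(np.round(W / q) * q, np.round(b / q) * q) for W, b in layers]

# Shared input set: the unit hypercube around the origin.
lo_in, hi_in = -np.ones(4), np.ones(4)
lo_o, hi_o = network_output_box(layers, lo_in, hi_in)
lo_c, hi_c = network_output_box(layers_c, lo_in, hi_in)

# Per-output upper bound on |y_original - y_compressed| over the input box:
# y_o - y_c lies in [lo_o - hi_c, hi_o - lo_c], so take the larger magnitude.
disc_upper = np.maximum(np.abs(hi_o - lo_c), np.abs(hi_c - lo_o))
print("per-output discrepancy upper bound:", disc_upper)

Because the two output boxes above are propagated independently, the bound ignores the fact that both networks receive the same input and is therefore loose; the merged-network construction described in the abstract presumably keeps this shared dependence through a single ImageStar reachability computation, which is what allows an accurate, rather than merely sound, maximum output discrepancy.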

Original language: English (US)
Article number: 120367
Journal: Information Sciences
Volume: 665
DOIs
State: Published - Apr 2024

Keywords

  • Convolutional neural network
  • Discrepancy computation
  • Neural network compression
  • Reachability analysis

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management
  • Artificial Intelligence
