Netinfo Security ›› 2025, Vol. 25 ›› Issue (10): 1589-1603. doi: 10.3969/j.issn.1671-1122.2025.10.010

Binary Code Similarity Detection Method Based on Multivariate Semantic Graph

ZHANG Lu, JIA Peng, LIU Jiayong

  1. School of Cyber Science and Engineering, Sichuan University, Chengdu 610207, China
  • Received: 2024-06-05 Online: 2025-10-10 Published: 2025-11-07
  • Contact: JIA Peng E-mail: pengjia@scu.edu.cn

Abstract:

Binary code similarity detection underpins applications such as code clone detection, vulnerability search, and software theft detection. However, binary code loses the rich semantic information of the source code during compilation, and the diversity of the compilation process makes it difficult to obtain an effective feature representation. To address this challenge, this paper proposes SiamGGCN, a similarity detection architecture that fuses gated graph neural networks with attention mechanisms. It also introduces a multivariate semantic graph that combines the control flow, sequence flow, and data flow information of assembly language, providing a more accurate and comprehensive semantic analysis for binary code similarity detection. The proposed method is validated experimentally on multiple datasets and across a wide range of scenarios. The results show that SiamGGCN significantly outperforms existing methods in precision and recall, demonstrating its performance and application potential in binary code similarity detection.
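
The idea of a graph that carries control flow, sequence flow, and data flow edges at once can be illustrated with a minimal sketch. The abstract does not specify how the paper builds its graph, so everything here is an assumption made for illustration: the toy instruction format, the name `build_semantic_graph`, instruction-level nodes, and the linear-scan def-use rule (which ignores alternative paths that a real data flow analysis would follow).

```python
from collections import defaultdict

# Hypothetical toy instruction format (not from the paper):
# (mnemonic, registers defined, registers used, jump target index or None)
TOY_FUNCTION = [
    ("mov", {"eax"}, set(), None),       # 0: define eax
    ("cmp", {"flags"}, {"eax"}, None),   # 1: read eax, define flags
    ("jne", set(), {"flags"}, 4),        # 2: conditional jump to instruction 4
    ("add", {"eax"}, {"eax"}, None),     # 3: read and redefine eax
    ("ret", set(), {"eax"}, None),       # 4: read eax
]


def build_semantic_graph(instructions):
    """Return a dict mapping edge type -> set of (src, dst) instruction indices."""
    edges = defaultdict(set)
    last_def = {}  # register -> index of the most recent defining instruction
    for i, (_mnemonic, defs, uses, target) in enumerate(instructions):
        # Sequence flow: link each instruction to the next one in address order.
        if i + 1 < len(instructions):
            edges["sequence"].add((i, i + 1))
        # Control flow: link a jump to its explicit target.
        if target is not None:
            edges["control"].add((i, target))
        # Data flow (simplified): link the latest definition of a register,
        # in linear order, to each instruction that reads it.
        for reg in uses:
            if reg in last_def:
                edges["data"].add((last_def[reg], i))
        for reg in defs:
            last_def[reg] = i
    return edges


graph = build_semantic_graph(TOY_FUNCTION)
for kind in ("sequence", "control", "data"):
    print(kind, sorted(graph[kind]))
```

Keeping the three edge types in separate sets means a typed graph neural network, such as a gated graph network with one propagation matrix per edge type, could treat each relation differently; whether SiamGGCN does so in this way is not stated in the abstract.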

Key words: code similarity, binary analysis, graph neural networks, graph embedding

CLC Number: