Netinfo Security, 2024, Vol. 24, Issue (12): 1896-1910. doi: 10.3969/j.issn.1671-1122.2024.12.008

Control Flow Transformation Based Adversarial Example Generation for Attacking Malware Detection GNN Model

LI Yixuan1, JIA Peng1, FAN Ximing1, CHEN Chen2

  1. School of Cyber Science and Engineering, Sichuan University, Chengdu 610065, China
    2. China Electronics Technology Cyber Security Co., Ltd., Beijing 100048, China
  • Received: 2024-07-09 Online: 2024-12-10 Published: 2025-01-10

Abstract:

Graph neural network (GNN) detectors based on control flow graphs have achieved significant results in malware detection and are currently the mainstream, most advanced approach. Existing methods for generating adversarial samples against GNN-based malware detection models mainly work by modifying the basic-block or edge features of the control flow graph, rather than by altering the original binary program that is input to the model. These methods are limited in real-world scenarios, where attackers can rarely access the control flow graph feature extraction process directly or obtain the model's intermediate-layer features. This paper proposes an adversarial attack framework, IRAttack, which changes the control flow graph of a binary program by transforming its intermediate representation (IR), and thereby efficiently generates adversarial samples against control flow graph-based GNN detection models. The IR is modified with three operations: inserting semantic NOP (no-operation) instructions, control flow flattening, and control flow obfuscation. Together these alter the node features and structural features of the control flow graph extracted from the binary program. In addition, ideas from fuzz testing are used to select the positions to modify and the content to add, so that samples capable of misleading GNN detection models are generated more effectively. Experiments were conducted on 5472 benign samples and 5230 malicious samples. Two feature extraction methods were combined pairwise with three model architectures, yielding six models as attack targets. The experimental results show that, under the same conditions, the average attack success rate of IRAttack exceeds that of SRLAttack and IMalerAttack by 46.39% and 62.69%, respectively.
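
To illustrate why the transformations named in the abstract change what a graph-based detector sees, the following is a minimal sketch on a toy control flow graph. It is not the IRAttack implementation: the graph encoding (a dict of successor lists), the function names `flatten` and `insert_semantic_nop`, the block names, and the instruction strings are all hypothetical and chosen only for illustration. Flattening is modeled purely by its structural effect (every transfer is routed through one dispatcher block), and NOP insertion by its effect on a per-block instruction count.

```python
from typing import Dict, List

CFG = Dict[str, List[str]]          # block name -> successor block names
Blocks = Dict[str, List[str]]       # block name -> instruction strings


def flatten(cfg: CFG, dispatcher: str = "dispatch") -> CFG:
    """Return a copy of cfg in which all transfers go through a dispatcher."""
    if dispatcher in cfg:
        raise ValueError("dispatcher name collides with an existing block")
    # The dispatcher can reach every original block.
    flat: CFG = {dispatcher: sorted(cfg)}
    for block, succs in cfg.items():
        # Blocks that had successors now return to the dispatcher;
        # exit blocks (no successors) remain exits.
        flat[block] = [dispatcher] if succs else []
    return flat


def insert_semantic_nop(blocks: Blocks, target: str) -> Blocks:
    """Return a copy of blocks with a no-effect instruction pair added to target."""
    out = {name: list(insns) for name, insns in blocks.items()}
    # Hypothetical neutral pair: the register value is unchanged afterwards.
    out[target] = ["push rax", "pop rax"] + out[target]
    return out


def edge_count(cfg: CFG) -> int:
    return sum(len(succs) for succs in cfg.values())


# Toy if/else function (hypothetical).
original: CFG = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}
flattened = flatten(original)

blocks: Blocks = {"entry": ["mov eax, 1"], "cond": ["cmp eax, 0"]}
padded = insert_semantic_nop(blocks, "entry")

print(len(original), edge_count(original))    # nodes and edges before flattening
print(len(flattened), edge_count(flattened))  # nodes and edges after flattening
print(len(blocks["entry"]), len(padded["entry"]))  # instruction count before/after
```

In this toy case the structural features (node and edge counts, and the adjacency pattern) change under flattening, while NOP insertion changes only the per-block instruction statistics. These are the two kinds of features the abstract says the attack perturbs.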

Key words: adversarial attack, GNN, malware detection, control flow transformation
