Netinfo Security

Data Augmentation Method via Large Language Model for Relation Extraction in Cybersecurity

LI Jiao, ZHANG Yuqing, WU Yabiao

2024, 24 (10): 1477-1483. doi: 10.3969/j.issn.1671-1122.2024.10.001

Relation extraction can be used for threat intelligence mining and analysis, providing crucial information support for network security defense. However, relation extraction in cybersecurity suffers from a shortage of datasets. In recent years, large language models have shown superior text generation ability, providing strong technical support for data augmentation. To compensate for the shortcomings of traditional data augmentation methods in accuracy and diversity, this paper proposed MGDA, a data augmentation method based on large language models for relation extraction in cybersecurity. MGDA used a large language model to augment the original data at four granularities (words, phrases, grammar, and semantics) in order to ensure accuracy while improving diversity. The experimental results show that the proposed method effectively improves both the performance of relation extraction in cybersecurity and the diversity of the generated data.

Review of Encrypted Network Traffic Anonymity and Systemic Defense Tactics

WANG Qiang, LIU Yizhi, LI Tao, HE Xiaochuan

2024, 24 (10): 1484-1492. doi: 10.3969/j.issn.1671-1122.2024.10.002

Advanced persistent threat (APT) attacks, which are characterized by complex organization, efficient planning, and clear targeting, are one of the main threats facing our country, and APT organizations are increasingly acting covertly and attacking on a regular basis. In recent years, it has become more and more difficult for our country to keep track of the main APT activities, in part because APT organizations blend their attacks into normal information services and network activities and hide their attack traffic in normal communication traffic. The state in which such highly concealed attack behavior is hidden is called the dense state. How to detect dense-state behavior and mount a systematic countermeasure is one of the bottleneck problems in current cyberspace defense. Starting from the mechanisms of the traffic transmission hiding techniques used by advanced attack activities in cyberspace, this paper puts forward a research framework and a countermeasure capability evaluation index system for countering dense-state traffic hiding, based on two dimensions: anonymous communication link construction and traffic characteristic behavior detection. It also comprehensively reviews the relevant research progress, research methods, and solutions of recent years, in order to explore new development directions for dense-state countermeasure capability in cyberspace.

Vulnerability Causation Analysis Based on Dynamic Execution Logging and Reverse Analysis

SHEN Qintao, LIANG Ruigang, WANG Baolin, ZHANG Jingcheng, CHEN Kai

2024, 24 (10): 1493-1505. doi: 10.3969/j.issn.1671-1122.2024.10.003

Software vulnerabilities pose a great threat to software security, and numerous security incidents caused by software vulnerabilities occur around the world every year. In actual development, developers’ lack of security awareness and the growing complexity of code and business logic make security vulnerabilities in software code difficult to avoid. To address the inaccurate error-code localization and inefficient analysis of existing methods, this paper overcame the difficulties of obtaining and reverse-analyzing instruction runtime information and of accurately locating error code, and proposed a method for locating the cause of program errors based on trace logs and reverse execution. The method tracks the code execution flow of the program, records the register state information and storage access state information of each instruction at runtime, and analyzes the set of instructions that generate, use, and compute the pointer values associated with the pointer that triggers the execution error, thereby achieving efficient and accurate vulnerability cause analysis and localization.

Mining Traffic Detection Method Based on Global Feature Learning

WEI Jinxia, HUANG Xizhang, FU Yuhao, LI Jing, LONG Chun

2024, 24 (10): 1506-1514. doi: 10.3969/j.issn.1671-1122.2024.10.004

Mining traffic detection is a variable-length data classification task. Existing detection schemes, such as keyword matching and N-gram feature signatures, rely on local feature classification and fail to fully utilize the global features of traffic. Modeling mining traffic with deep learning makes it possible to extract these global features and improve detection accuracy. The traffic classification model proposed in the article first employed a Transformer encoder to extract global features of the traffic, then used a sequence summarizer to process the encoded results and obtain a fixed-length representation for classification. Because mining samples account for less than 3% of the dataset, measuring classification performance by accuracy introduces significant bias. Therefore, the article considered both the precision and the recall of the model and employed the F1 score to evaluate classification performance. With sinusoidal positional encoding in the encoder, the model achieves an F1 score of 99.84% on the test set, with a precision of 100%.
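
The bias of accuracy on such an imbalanced dataset can be illustrated with a small sketch. The sample counts and the two predictors below are hypothetical and are not taken from the article; they only assume a test set in which 2.5% of flows are mining traffic.

```python
def metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for binary labels, 1 = mining."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when nothing is predicted or labeled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, precision, recall, f1

# Hypothetical test set: 1000 flows, 25 of them mining (2.5%).
y_true = [1] * 25 + [0] * 975

# A useless predictor that labels every flow as benign.
always_benign = [0] * 1000

# A hypothetical detector: finds 20 of the 25 mining flows, raises 2 false alarms.
detector = [1] * 20 + [0] * 5 + [1] * 2 + [0] * 973

for name, pred in [("always_benign", always_benign), ("detector", detector)]:
    acc, prec, rec, f1 = metrics(y_true, pred)
    print(f"{name}: accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

The predictor that never flags mining traffic still reaches 97.5% accuracy while its F1 score is 0, which is why a metric built from precision and recall is more informative here.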

Privacy Computing in Environmental Big Data on Blockchain

WANG Nan, YUAN Ye, YANG Haoran, WEN Zhouzhi, SU Ming, LIU Xiaoguang

2024, 24 (10): 1515-1527. doi: 10.3969/j.issn.1671-1122.2024.10.005

In recent years, as China has continued to introduce policies on security, healthcare, and environmental protection, environmental data has become increasingly valuable. However, the scientific management and secure sharing of environmental data in China are still in their infancy, while the amount of environmental data requiring privacy protection has grown considerably. Data sharing also faces many problems, such as isolated data centers and a high risk of privacy leakage. Targeting the environmental data sharing scenario and the privacy protection needs of users’ cloud storage and cloud computing, the article combined blockchain with privacy computing to construct a data security management system based on fully homomorphic encryption and searchable encryption, using the national cryptographic algorithms. Relying on blockchain deployment and on cloud services for storage and computing, the system realizes two functions: machine learning supported by fully homomorphic encryption, and searchable encryption. Based on a fully homomorphic encryption scheme, the article implemented a neural network prediction model and completed privacy computing on encrypted data. Moreover, the article implemented a symmetric searchable encryption scheme in which the data remains encrypted in the cloud throughout the whole process, achieving ciphertext retrieval while protecting the privacy of query keywords. The system is deployed on the EOS blockchain with a weakly centralized third party, where the blockchain provides a reliable platform for data transactions and digital evidence, and IPFS (InterPlanetary File System) provides a safe custody platform for encrypted data storage. As a result, the data circulation channels of all participants are effectively connected while both the privacy and the availability of encrypted data are ensured.

Survey on Fuzzing Test in Deep Learning Frameworks

ZHANG Zihan, LAI Qingnan, ZHOU Changling

2024, 24 (10): 1528-1536. doi: 10.3969/j.issn.1671-1122.2024.10.006

With the widespread application of deep learning technology in various fields, ensuring the security and stability of deep learning frameworks has become crucial. Starting from the user’s perspective, this paper analyzed the types of vulnerabilities that different user groups may encounter and the corresponding fuzzing methods. The article first introduced the development background and importance of deep learning frameworks, then discussed in detail the current state of testing research for model libraries, deep learning frameworks, and compilers, and reviewed key techniques such as model mutation, weight generation, sample construction, and model testing. The article then analyzed the root causes of bugs in PyTorch and MLIR. Finally, the article looked ahead to future research directions, including error localization and automatic repair techniques, as well as fuzzing enhanced by large language models.

Fingerprint Feature Extraction of Electronic Medical Records Based on Few-Shot Named Entity Recognition Technology

WANG Yaxin, ZHANG Jian

2024, 24 (10): 1537-1543. doi: 10.3969/j.issn.1671-1122.2024.10.007

With the promulgation and implementation of the “Personal Information Protection Law of the People’s Republic of China”, the “Data Security Law of the People’s Republic of China”, and other relevant laws and regulations, the protection of electronic medical record data has attracted much attention. Fast and efficient identification of electronic medical records is the first step in data protection and an important research topic in the field of data security. This paper proposed an electronic medical record fingerprint feature extraction method based on few-shot named entity recognition technology. First, the encoder was trained on a public dataset to obtain a broad text feature space. Subsequently, the encoder was fine-tuned on the electronic medical record dataset, and the entity type labels were represented by a prototype network. Finally, a fingerprint feature of the form “entity type + entity set” was obtained by extracting the features of the electronic medical record. The experimental results show that the method performs well on the I2B2 dataset, surpassing other models and effectively improving the privacy protection of electronic medical record datasets.

The Research on Efficient Web Fuzzing Technology Based on Graph Isomorphic Network

ZHANG Zhanpeng, WANG Juan, ZHANG Chong, WANG Jie, HU Yuyi

2024, 24 (10): 1544-1552. doi: 10.3969/j.issn.1671-1122.2024.10.008

Existing Web fuzzing methods mainly consist of dictionary-based black-box testing methods and gray-box testing methods borrowed from binary fuzzing. These methods suffer from high randomness and low efficiency. To address these issues, the article proposed an efficient Web fuzzing method based on a graph isomorphism network. First, leveraging the powerful capabilities of the graph isomorphism network in graph representation and structure learning, the semantic and structural features of vulnerabilities were learned on the control flow graph of the code, and the vulnerability probability of each basic block was predicted. Then, based on the vulnerability prediction results, a dual-guidance fuzzing strategy for Web applications was designed that considers both vulnerability probability and coverage. It prioritized the exploration of program locations with a higher probability of vulnerability without compromising coverage, effectively addressing the high randomness and low efficiency of existing Web application fuzzing tools. Finally, based on the above methods, a prototype system was implemented and experimentally evaluated. The experimental results show that the efficiency of the system has increased by 40% and the coverage has expanded by 5%.

A Survey of Ownership Protection Schemes for Federated Learning Models

SA Qirui, YOU Weijing, ZHANG Yifei, QIU Weiyang, MA Cunqing

2024, 24 (10): 1553-1561. doi: 10.3969/j.issn.1671-1122.2024.10.009

In recent years, machine learning has emerged as a key technology driving development across various industries. By integrating local data training with online gradient iteration, federated learning has improved both model generalization and data privacy protection in distributed, secure multi-party machine learning. Because training federated learning models is costly in both computational power and datasets, protecting the ownership of these economically valuable models has become particularly important. This article surveyed existing ownership protection schemes for federated learning models. It examined two fingerprinting schemes, eight black-box watermarking schemes, and five white-box watermarking schemes to analyze the current state of research on model ownership protection. Additionally, drawing on methods for protecting the ownership of deep neural network models, the article provided insights into current research directions for protecting the ownership of federated learning models.

A Data-Free Personalized Federated Learning Algorithm Based on Knowledge Distillation

CHEN Jing, ZHANG Jian

2024, 24 (10): 1562-1569. doi: 10.3969/j.issn.1671-1122.2024.10.010

Federated learning algorithms usually face large differences between clients, and this heterogeneity degrades the performance of the global model; knowledge distillation approaches can mitigate the problem. To further remove the dependence on public data and improve model performance, DFP-KD trained a robust federated learning global model using data-free knowledge distillation, used ReACGAN as the generator, and adopted a step-by-step EMA fast updating strategy, which sped up the update rate of the global model while avoiding catastrophic forgetting. Comparison experiments, ablation experiments, and parameter sensitivity experiments show that DFP-KD outperforms classical data-free knowledge distillation algorithms in terms of accuracy, stability, and update rate.
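
As a rough illustration of an EMA (exponential moving average) update of global model parameters, the sketch below blends the current global weights with newly obtained weights. It is a generic EMA under stated assumptions, not the step-by-step schedule of DFP-KD, which the abstract does not specify; the function name `ema_update`, the dictionary representation of weights, and the `decay` value are all hypothetical.

```python
def ema_update(global_weights, new_weights, decay=0.9):
    """Blend new weights into the global weights with an exponential moving average.

    A larger decay keeps more of the previous global model (less forgetting);
    a smaller decay follows the new weights more quickly (faster updates).
    """
    return {
        name: decay * global_weights[name] + (1.0 - decay) * new_weights[name]
        for name in global_weights
    }

# Hypothetical single-parameter model, updated over three rounds.
global_weights = {"w": 1.0}
round_updates = [{"w": 0.0}, {"w": 0.0}, {"w": 0.0}]

for update in round_updates:
    global_weights = ema_update(global_weights, update, decay=0.9)
    print(global_weights)
```

The decay factor is the knob that trades update speed against retention of earlier knowledge, which is the tension the abstract describes.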

A Random Walk Based Black-Box Adversarial Attack against Graph Neural Network

LU Xiaofeng, CHENG Tianze, LONG Chengnian

2024, 24 (10): 1570-1577. doi: 10.3969/j.issn.1671-1122.2024.10.011

Graph neural networks have achieved remarkable success on many graph analysis tasks. However, recent studies have unveiled their susceptibility to adversarial attacks. Existing research on black-box attacks often requires attackers to know all the training data of the target model, and is not applicable in scenarios where attackers have difficulty obtaining the feature representations of graph neural network nodes. This paper proposed a stricter black-box attack model, in which the attacker only knew the graph structure and the labels of selected nodes and had no knowledge of node feature representations. Under this attack model, this paper proposed a black-box adversarial attack method against graph neural networks. The approach approximated the influence of each node on the model output and identified optimal perturbations with a greedy strategy. Experiments show that, although less information is available, the attack success rate of the algorithm is close to that of state-of-the-art algorithms, while its attack speed is higher. In addition, the attack method in this article is transferable and remains effective against defenses.

Defense Strategies against Poisoning Attacks in Semi-Asynchronous Federated Learning

WU Lizhao, WANG Xiaoding, XU Tian, QUE Youxiong, LIN Hui

2024, 24 (10): 1578-1585. doi: 10.3969/j.issn.1671-1122.2024.10.012

Due to its distributed nature, federated learning (FL) is vulnerable to model poisoning attacks, in which malicious clients compromise the accuracy of the global model by sending tampered model updates. Among the various FL branches, semi-asynchronous FL, with its lower real-time requirements, is particularly susceptible to such attacks. Currently, the primary means of detecting malicious clients is to analyze the statistical characteristics of client updates, yet this approach is inadequate for semi-asynchronous FL. The noise introduced by delays in stale updates renders existing detection algorithms unable to distinguish benign stale updates from clients from malicious updates from attackers. To address malicious client detection in semi-asynchronous FL, this paper proposed SAFLD, a detection method based on predicting model updates. By leveraging the historical updates of the model, SAFLD predicted stale updates from clients and assigned each client a maliciousness score, with higher-scoring clients being flagged as malicious and removed. Experimental validation on two benchmark datasets demonstrates that, compared to existing detection algorithms, SAFLD can more accurately detect various state-of-the-art model poisoning attacks in semi-asynchronous FL.

The Formal Analysis of SIP Protocol Based on the Recursive Authentication Test

YAO Mengmeng, WANG Yu, HONG Yuping

2024, 24 (10): 1586-1594. doi: 10.3969/j.issn.1671-1122.2024.10.013

The article aimed to prove the security of security protocols by formal analysis, and took the SIP protocol, which is flexible, open, and scalable, as its research object. The article employed a formal analysis approach based on the improved recursive authentication test within the framework of strand space theory. It scrutinized a SIP authentication negotiation protocol that had been proven secure using BAN logic, revealing inaccuracies in the protocol format and vulnerabilities to man-in-the-middle attacks during its execution. Subsequently, the article proposed a revised scheme tailored to address these deficiencies. The results indicate that the recursive authentication test formal analysis method employed in this article is more applicable and effective than BAN logic. Furthermore, the proposed improvements significantly enhance the security of the SIP authentication negotiation protocol.

Systematic Risk Assessment Analysis for Smart Wearable Devices

ZHAO Ge, ZHENG Yang, TAO Zelin

2024, 24 (10): 1595-1603. doi: 10.3969/j.issn.1671-1122.2024.10.014

Existing smart wearable devices generally have many vulnerable points, and the risks they face need to be determined scientifically through risk assessment. Current security risk assessment methods for smart wearable devices are mostly based on fragmented vulnerability points; they do not fully consider the systematic characteristics of wearable device application scenarios and cannot assess security risks as a whole. Therefore, the article proposed a risk assessment method for wearable devices based on a layered attack path diagram. The method categorized the vulnerabilities of wearable devices according to their location in the system, drew a multi-layer vulnerability relationship diagram, added the direct threats and data asset targets facing the system to the diagram, and merged and calculated the attack paths that run from the direct threats through the external vulnerability layer and the indirect threats to the internal vulnerability layer and the attack target, for risk assessment. The proposed method takes the characteristics of the system architecture into full consideration in the risk assessment process, which makes risk assessment easier and more accurate and helps to find the bottlenecks of system security and evaluate the effectiveness of countermeasures.

Research on ARP Spoofing Attack and Hardware Defense

HE Kaiyu, WANG Bin, YU Zhe, CHEN Fang

2024, 24 (10): 1604-1610. doi: 10.3969/j.issn.1671-1122.2024.10.015

In view of the cumbersome configuration and high cost of existing ARP spoofing defense methods, an FPGA-based hardware defense device was designed and tested in a real network environment. First, a real LAN environment was built, and the arpspoof tool was used to launch an ARP spoofing attack on a target host in the LAN; after being attacked, the target host could not access the external network. Next, a network security protection device based on an FPGA platform was designed to identify and filter ARP spoofing packets by analyzing the network packets on the upstream and downstream links and comparing them with the corresponding packet fields of the security protection policy. Finally, the network security protection device was connected to the LAN, and the Vivado ILA (Integrated Logic Analyzer) captured the waveforms of the relevant fields of the ARP spoofing attack packets. The waveform data show that the network security device can effectively identify the MAC address and IP address of ARP spoofing attack packets and intercept them. The changes in network link bandwidth, attack interception rate, and system resource usage of the attacked host were also collected.
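
The idea of comparing ARP packet fields with a protection policy can be sketched in software. This is only an illustration under stated assumptions, not the FPGA logic of the article: the policy is assumed to be a static table binding each protected IP address to its legitimate MAC address, and the names `parse_arp`, `is_spoofed`, and `policy` are hypothetical. The 28-byte field layout is the standard ARP payload for IPv4 over Ethernet.

```python
import socket
import struct

# Standard ARP payload for IPv4 over Ethernet (28 bytes):
# htype, ptype, hlen, plen, oper, sender MAC, sender IP, target MAC, target IP
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_LENGTH = struct.calcsize(ARP_FORMAT)

def format_mac(raw):
    """Render 6 raw bytes as a lowercase colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)

def parse_arp(payload):
    """Extract the operation and the sender/target address fields from an ARP payload."""
    _, _, _, _, oper, sha, spa, tha, tpa = struct.unpack(ARP_FORMAT, payload[:ARP_LENGTH])
    return {
        "oper": oper,
        "sender_mac": format_mac(sha),
        "sender_ip": socket.inet_ntoa(spa),
        "target_mac": format_mac(tha),
        "target_ip": socket.inet_ntoa(tpa),
    }

def is_spoofed(arp, policy):
    """Flag a packet whose sender IP is protected but whose sender MAC differs from the policy."""
    expected_mac = policy.get(arp["sender_ip"])
    return expected_mac is not None and expected_mac != arp["sender_mac"]

def build_arp_reply(sender_mac_hex, sender_ip, target_mac_hex, target_ip):
    """Build an ARP reply payload (oper = 2) for the demonstration below."""
    return struct.pack(
        ARP_FORMAT, 1, 0x0800, 6, 4, 2,
        bytes.fromhex(sender_mac_hex), socket.inet_aton(sender_ip),
        bytes.fromhex(target_mac_hex), socket.inet_aton(target_ip),
    )

# Hypothetical policy: the gateway 192.168.1.1 is bound to one legitimate MAC address.
policy = {"192.168.1.1": "00:11:22:33:44:55"}

forged = build_arp_reply("aabbccddeeff", "192.168.1.1", "112233445566", "192.168.1.10")
genuine = build_arp_reply("001122334455", "192.168.1.1", "112233445566", "192.168.1.10")

print(is_spoofed(parse_arp(forged), policy))
print(is_spoofed(parse_arp(genuine), policy))
```

A static binding table is the simplest possible policy; it is used here only because the comparison reduces to two field lookups per packet, which is the kind of check that maps naturally onto hardware.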
