Netinfo Security ›› 2022, Vol. 22 ›› Issue (8): 81-89. doi: 10.3969/j.issn.1671-1122.2022.08.010

Anomaly Detection of Imbalanced Data in Industrial Control System Based on GAN-Cross

GU Zhaojun1,2, LIU Tingting1,2, GAO Bing1,2, SUI He3

    1. Information Security Evaluation Center, Civil Aviation University of China, Tianjin 300300, China
    2. College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
    3. College of Aeronautical Engineering, Civil Aviation University of China, Tianjin 300300, China
  • Received: 2021-05-26 Online: 2022-08-10 Published: 2022-09-15
  • Contact: SUI He E-mail: hsui@cauc.edu.cn

Abstract:

Anomaly detection in industrial control systems suffers from class imbalance, which makes it difficult for general-purpose classifiers to identify abnormal data accurately. Sampling methods are commonly used on class-imbalanced data to balance the classes and thereby improve classifier performance. However, traditional sampling methods are sensitive to the characteristics of the data set, so their sampling effect is unstable and anomaly detection accuracy fluctuates. This paper proposes GAN-Cross, a sampling model based on the generative adversarial network (GAN). The model learns the probability distribution of the target data and generates data with a similar distribution, thereby achieving the sampling effect. To improve feature extraction, a cross layer is applied in the generator and the discriminator. Finally, the model is combined with four classic classifiers (random forest, K-nearest neighbor, Gaussian naive Bayes, and support vector machine) and compared with four other conventional sampling methods on four public imbalanced data sets. Experimental results show that, compared with traditional sampling methods, the model significantly improves the anomaly detection performance of the classifiers on imbalanced data.
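
The two ideas in the abstract that lend themselves to a small illustration are class balancing by sampling and the cross layer. The following is a minimal sketch, not the authors' implementation. The abstract does not specify the form of the cross layer, so the sketch assumes the Deep & Cross Network formulation, x_{l+1} = x_0 (x_l · w) + b + x_l. The function names `synthetic_counts` and `cross_layer` are hypothetical, and no GAN training is shown.

```python
from collections import Counter


def synthetic_counts(labels):
    """Return how many synthetic samples each class needs to match the largest class.

    Assumption: "balance" means every class is brought up to the majority count.
    """
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


def cross_layer(x0, xl, w, b):
    """Apply one cross layer to feature vectors given as lists of floats.

    Assumed form (Deep & Cross Network style, not confirmed by the abstract):
        x_{l+1} = x0 * (xl . w) + b + xl
    where x0 is the layer-0 input, xl the current layer input,
    and w, b are the layer's weight and bias vectors.
    """
    if not (len(x0) == len(xl) == len(w) == len(b)):
        raise ValueError("x0, xl, w and b must have the same length")
    # Scalar projection of the current features onto the weight vector.
    s = sum(xi * wi for xi, wi in zip(xl, w))
    # Scale the original input by that scalar, then add bias and residual.
    return [x0i * s + bi + xli for x0i, bi, xli in zip(x0, b, xl)]


if __name__ == "__main__":
    labels = ["normal"] * 5 + ["attack"]
    print(synthetic_counts(labels))
    x0 = [1.0, 2.0]
    print(cross_layer(x0, x0, [0.5, 0.5], [0.0, 0.0]))
```

In a sampler of this kind, the counts returned by `synthetic_counts` would tell the generator how many minority-class records to produce, while the cross layer adds explicit feature interactions with the original input at each depth.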

Key words: industrial control system, imbalanced data, generative adversarial network, sampling method, anomaly detection

CLC Number: