Netinfo Security ›› 2023, Vol. 23 ›› Issue (2): 85-95. doi: 10.3969/j.issn.1671-1122.2023.02.010


Static Detection Method of Android Adware Based on Improved Random Forest Algorithm

HU Zhijie¹, CHEN Xingshu², YUAN Daohua¹, ZHENG Tao²

  1. School of Computer Science, Sichuan University, Chengdu 610065, China
    2. School of Cyber Science and Engineering, Sichuan University, Chengdu 610207, China
  • Received: 2022-10-19 Online: 2023-02-10 Published: 2023-02-28
  • Contact: CHEN Xingshu E-mail: chenxsh@scu.edu.cn

Abstract:

Android adware displays advertisements in a disruptive way and can further evolve into malware, posing a serious threat to users' smartphones. Traditional adware detection methods are time-consuming and depend on the dynamic features of Android adware, which makes it difficult for them to meet large-scale, high-precision detection requirements. To solve this problem, a static detection method for Android adware based on an improved random forest algorithm is proposed. In view of the characteristics of Android adware, third-party libraries are added to the scope of feature selection alongside the traditional application programming interface (API), permission and intent features. All adware APKs collected in the dataset are statically decompiled, and the static information extracted from them is statistically analyzed to obtain the high-frequency information. After this information is filtered, the base feature set is determined, and the static information in each APK is extracted and transformed into a feature vector. Following the ensemble idea, a variety of feature selection algorithms are used jointly to select features for model training and to assign feature weights. Finally, an improved random forest algorithm based on these feature weights is used to improve the accuracy of the classifier, and 5751 adware and 3465 non-adware applications are selected for classification and detection. The experimental results show that the method achieves faster detection while maintaining accuracy.
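
The abstract does not say which feature selection algorithms are combined or how the feature weights enter the random forest. As an illustration only, and not the authors' implementation, the sketch below assumes binary feature vectors (1 if an APK uses a given API, permission, intent or third-party library), takes information gain and the chi-square statistic as the two selectors, averages their normalised scores into one weight per feature, and draws each tree's candidate features with probability proportional to those weights. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_gain(column, labels):
    """Information gain of one binary feature column with respect to the labels."""
    n = len(labels)
    gain = entropy(labels)
    for value in (0, 1):
        subset = [y for x, y in zip(column, labels) if x == value]
        if subset:
            gain -= len(subset) / n * entropy(subset)
    return gain


def chi_square(column, labels):
    """Chi-square statistic of the 2x2 table (binary feature vs. binary label)."""
    n = len(labels)
    a = sum(1 for x, y in zip(column, labels) if x == 1 and y == 1)
    b = sum(1 for x, y in zip(column, labels) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(column, labels) if x == 0 and y == 1)
    d = n - a - b - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def normalise(scores):
    """Scale scores so they sum to 1 (uniform if every score is zero)."""
    total = sum(scores)
    if not total:
        return [1 / len(scores)] * len(scores)
    return [s / total for s in scores]


def ensemble_feature_weights(rows, labels):
    """Average the normalised scores of both selectors into one weight per feature."""
    columns = list(zip(*rows))
    ig = normalise([info_gain(col, labels) for col in columns])
    chi = normalise([chi_square(col, labels) for col in columns])
    return [(g + c) / 2 for g, c in zip(ig, chi)]


def sample_features(weights, k, rng):
    """Draw k distinct feature indices with probability proportional to weight."""
    candidates = list(range(len(weights)))
    # Assumption: a tiny floor keeps zero-weight features drawable.
    w = [x + 1e-9 for x in weights]
    chosen = []
    for _ in range(min(k, len(candidates))):
        pick = rng.choices(range(len(candidates)), weights=w)[0]
        chosen.append(candidates.pop(pick))
        w.pop(pick)
    return chosen


if __name__ == "__main__":
    # Toy data: feature 0 matches the label, feature 1 is constant, feature 2 is noise.
    rows = [(1, 1, 0), (1, 1, 1), (0, 1, 0), (0, 1, 1)]
    labels = [1, 1, 0, 0]
    weights = ensemble_feature_weights(rows, labels)
    print(weights)
    print(sample_features(weights, 2, random.Random(0)))
```

In a feature-weighted forest of this kind, a draw such as `sample_features` would replace the uniform random feature subset that a standard random forest considers, so that features rated highly by the selectors are examined more often.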

Key words: Android, adware, static detection, machine learning
