Netinfo Security (信息网络安全), 2016, Vol. 16, Issue 8: 74-80. doi: 10.3969/j.issn.1671-1122.2016.08.012


基于N-gram算法的恶意程序检测系统研究与设计

张家旺, 李燕伟

  1. 国家计算机网络应急技术处理协调中心,北京 100029
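  • Received: 2016-06-10  Publication date: 2016-08-20  Release date: 2020-05-13
  • About the authors:

    Jiawang ZHANG (b. 1981), male, from Beijing, engineer, master's degree; research interests: malicious code analysis, vulnerability discovery and exploitation. Yanwei LI (b. 1983), male, from Hebei, engineer, Ph.D.; research interests: malicious code analysis, vulnerability discovery and exploitation.

  • Funding:
    National Natural Science Foundation of China [61402125]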

Research and Design on Malware Detection System Based on N-gram Algorithm

Jiawang ZHANG, Yanwei LI

  1. National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China
  • Received: 2016-06-10  Online: 2016-08-20  Published: 2020-05-13

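Abstract (Chinese original, translated):

To address the difficulty of detecting unknown malicious programs, this paper proposes a method for extracting the semantic features of malicious programs. The method applies the N-gram algorithm to the permission and API features extracted from Android applications to build semantic feature sequences, and then filters those sequences to obtain more representative behavior feature sequences. First, to make the features more effective, experienced malware analysts assign a weight to each API function in the Android SDK, and the feature value of each element in an N-gram sequence is recalculated from its occurrence frequency and its weight, which yields an improved N-gram sequence model. Then, several machine learning algorithms are used for classification and detection to verify the model's effectiveness. The experimental results show that the extracted features and the improved N-gram algorithm can effectively detect malicious programs on the Android platform.

Keywords: machine learning, malicious code detection, N-gram, Android applications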

Abstract:

Unknown malicious programs are difficult to detect. To address this problem, this paper proposes an approach for extracting the semantic features of malicious code. The method extracts the permission and API features of Android applications and uses the N-gram algorithm to build semantic feature sequences from them. Filtering these sequences yields behavior sequences that are more representative. First, to make the features more effective, experienced malware analysts assign a weight to each API function in the Android SDK, and the feature value of each element of an N-gram sequence is recalculated from its frequency and its weight, producing an improved N-gram sequence model. Then, a variety of machine learning algorithms are used for classification and detection to verify the model's effectiveness. The experimental results show that the features and the improved N-gram algorithm presented in this paper can effectively detect malicious programs on the Android platform.

Key words: machine learning, malicious code detection, N-gram, Android application
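
The weighting step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so everything specific here is an assumption made for illustration: the weight table `API_WEIGHTS`, the fallback `DEFAULT_WEIGHT`, and the rule that an N-gram's feature value is its frequency multiplied by the mean weight of its API calls. It is not the authors' implementation.

```python
from collections import Counter

# Hypothetical expert-assigned weights; the paper's actual values are not given here.
API_WEIGHTS = {
    "sendTextMessage": 0.9,
    "getDeviceId": 0.7,
    "openConnection": 0.4,
}
DEFAULT_WEIGHT = 0.1  # assumed weight for APIs that have no expert score


def ngrams(calls, n):
    """Return every run of n consecutive API names as a tuple."""
    return [tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)]


def weighted_ngram_features(calls, n=2, weights=API_WEIGHTS):
    """Map each n-gram to frequency * mean weight of its calls (assumed formula)."""
    counts = Counter(ngrams(calls, n))
    features = {}
    for gram, freq in counts.items():
        mean_weight = sum(weights.get(api, DEFAULT_WEIGHT) for api in gram) / n
        features[gram] = freq * mean_weight
    return features


# Example API call sequence extracted from one (hypothetical) application.
calls = [
    "getDeviceId", "openConnection",
    "getDeviceId", "openConnection",
    "sendTextMessage",
]
features = weighted_ngram_features(calls, n=2)
for gram, value in sorted(features.items(), key=lambda kv: -kv[1]):
    print(gram, round(value, 3))
```

The resulting dictionary could then be turned into a fixed-length vector and passed to any standard classifier, which corresponds to the classification stage the abstract mentions.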

CLC number: